PREFACE


Linear programming has become established as a standard topic 
in mathematics degree courses, and it is often a part of mathematics 
courses for students of economics, computer science, business studies 
and other disciplines. It is one of the basic subjects included under 
the general title of Operations Research. The usefulness of linear 
programming in the real world and the intrinsic interest of the situations 
which lead to linear programming problems make it an attractive area 
of study for many students, especially as the mathematical prerequisites 
(basic linear algebra) are modest. 

This book is intended as an undergraduate text for a course of 
about 40 classes (i.e. lectures and tutorials or examples classes), or 
as a self-paced reading course with very much less teacher-student 
contact. The main parts of the first seven chapters, which develop 
the simplex method, duality and versions of the revised simplex 
method, can be used for a shorter course on general linear programming 
of about 15-20 classes. The background linear algebra needed consists 
of a knowledge of matrices, row and column vectors, their elementary 
properties and techniques of manipulation, the ideas of linear depen- 
dence, bases, matrix-inverses, rank, partitioned matrices and solving 
systems of linear equations. This is all standard material in a first 
linear algebra course and is available in many textbooks, for example 
references {1}, {2}, {3}. There is a good case for the view that 
linear programming should be included in a first linear algebra course, 
because it provides an interesting context for the practice and applica- 
tion which are necessary to understand fully the ideas and techniques 
of elementary linear algebra. However a first algebra course is not 
usually a suitable situation for discussing the practical problems 
involved, and there are valuable benefits (frequently overlooked) which 
can be realised when linear programming is treated as a mathematical 
topic in its own right. 

Linear programming is a subject in which the conceptual and 
manipulative difficulties, although substantial, allow other qualitative 
ideas to be examined and emphasised. One of these, the increased 
familiarity and competence with matrix operations, has already been 
implied, but there are important differences between linear algebra 
“in theory” and linear algebra “in practice”, and linear programming
provides a good context in which to introduce these differences. These
considerations also lead to the distinction between a “method” and
an “algorithm”, where the first may need human intelligence, common
sense and initiative, but the second must not require any of these
talents and must be precise and complete and suitable for conversion 
into a computer program. In general the methods developed in the 
book are not referred to or described as algorithms, but an awareness 
of the distinction has significantly influenced the way the methods 
are developed and presented. 

The presentation also stresses the underlying mathematical structure. 
This emphasis is a particularly important feature, especially for 
non-mathematics students, to ensure that the methods do not become 
just a collection of rules; it also means that studying the material 
involves development of a mathematical approach and of mathematical 
maturity. 

The contents are based on a 30-lecture course I have given several 
times at the University of Manchester, with a small amount of extra 
material. The course has been attended mainly by mathematics students 
but also by students of other disciplines, and their reactions have 
greatly influenced the choice of material and the various emphases. 
In particular I have tried to meet the needs of average students and 
to provide an accessible rather than a formal and strictly scholarly 
presentation of the material. This approach does not really handicap 
the dedicated mathematicians and in my experience it often results 
in a more beneficial experience for most students (from a general 
mathematical education point of view). There is no resulting loss 
of rigour. 

It has become fashionable to disparage the tableau approach to 
the simplex method and to favour a ‘‘modern’’ treatment based on 
matrix operations. Certainly the tableau approach is more rudimen- 
tary, but it does not necessarily lead to less insight. A more sophisticated 
mathematical technique generally requires more sophisticated mathe- 
matical experience in order to use it successfully, although its use 
can itself facilitate such experience. I believe that both approaches 
have significant advantages and it is worth spending the extra time 
required to study both. For this reason the tableau operations of 
chapter 3 are interpreted as matrix operations at the end of chapter 
3 but retained in chapters 4 and 5. Then the simplex method is reviewed 
and the duality theorem re-proved in chapter 6, using the matrix 
operations approach. Chapter 7 also uses the matrix operations 
approach. 

As I have aimed at providing a course textbook I am interested
in knowing the views of students and teachers who use it, and I 
would be grateful to anyone who can find the time to write and 
let me know their reactions and to suggest improvements. 

It is customary for the author of a book to express his gratitude 
to the people who have helped him to write it. In my case many 
friends, colleagues and acquaintances come to mind who have no 
direct involvement with this book, but for whose very existence I 
am grateful and without whom life would be much less enjoyable. 
William Gossling and Christopher Baker encouraged me to write it 
but bear no responsibility for its shortcomings, nor does Len Freeman, 
who helped to minimise them. 


Will McLewin 

Department of Mathematics, 
Manchester University, 
Manchester M13 9PL. 
England 


June 1980. 


Notation and Abbreviations 


Matrices are denoted by upper-case letters in bold type, for example
$\mathbf{A}$, and the element in the i-th row and j-th column is denoted by
$a_{ij}$, the corresponding lower-case letter with suffices i and j, or by
$(\mathbf{A})_{ij}$.

All vectors are column vectors, and are denoted by lower-case
letters in bold type, for example $\mathbf{x}$, and the elements of an n-vector
$\mathbf{x}$ by $x_1, x_2, \ldots, x_n$, so that

$$\mathbf{x} = \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix}.$$

Row vectors are column vectors transposed, and are denoted by an
upper suffix T: for example $\mathbf{y}^T = (y_1, y_2, \ldots, y_m)$ is a row
vector in m-space.

Unit matrices are denoted by $\mathbf{I}$, sometimes with a suffix to indicate
the size, so that $\mathbf{I}_m$ is the $m \times m$ unit matrix.

The unit vectors which are the columns of a unit matrix are denoted
by $\mathbf{e}_1, \mathbf{e}_2, \ldots$, and the vector $(1,1,\ldots,1)^T$ by $\mathbf{e}$. The column vector
in m-space which is the j-th column of the $m \times n$ matrix $\mathbf{A}$ is denoted
by $\mathbf{a}_j$, $j = 1,2,\ldots,n$, and the row vector in n-space which is the
i-th row of $\mathbf{A}$ is denoted by $\mathbf{a}_{i\cdot}$, $i = 1,2,\ldots,m$.

Partitioned matrices or vectors are denoted thus $(\mathbf{A}_1, \mathbf{A}_2)$ when
partitioned column-wise and $\begin{pmatrix} \mathbf{A}_1 \\ \mathbf{A}_2 \end{pmatrix}$ when partitioned row-wise. This
is also the notation when either of the matrices $\mathbf{A}_1$, $\mathbf{A}_2$ consists of
a single column or row, for example $(\mathbf{A}, \mathbf{b})$ is the $m \times (n+1)$ matrix
whose (n + 1)-th column is the vector $\mathbf{b}$.

As already implied, the set of all vectors $(x_1, x_2, \ldots, x_n)^T$ in n-dimensional
vector space is referred to simply as n-space.

The rank of a matrix A is denoted by r(A). 

Wherever possible the general element of a vector in m-space has
the suffix i and the general element of a vector in n-space has the
suffix j.

The objective function, typically $\mathbf{c}^T\mathbf{x}$, is denoted by $f(\mathbf{x})$ or simply
by f. When convenient we denote an optimum solution of a linear
programming problem by $\mathbf{x}_{opt}$, and the corresponding value of


We also note here that, for example, x is used to denote both 
the name of a vector and the value (i.e., the n actual numbers of 
which the vector consists in any particular example). This does not 
cause confusion and is common practice. 

The notation $\mathbf{A} \supset \mathbf{I}_m$ has a special meaning which is defined on
page 21.

The following abbreviations are used: 


l.p.p. for “linear programming problem”,
e.c.c. for “equivalent cost coefficient”, see page 28,
b.f.s. for “basic feasible solution”, see page 18,
w.l.o.g. for “without loss of generality”.


Each chapter is divided into convenient sections: for example §3.5
denotes the fifth section of chapter three. The appropriate section
number appears at the top of each page. Numbered equations or
expressions, for example (6), begin with (1) in each section and are
referred to simply by that number in the same section or in full
if in another section, for example (6) of §5.2.

The references listed on pages 210-11 are indicated in the text thus 
{4}. 

Theorems are numbered consecutively throughout the text and the
sections in which they appear are listed on page 209. The symbol
■ is used to denote the end of the statement of a theorem, and the
end of the proof.

In several places, statements in the text which may or may not 
require some explanation have been left as exercises for the reader 
(ER). 

There are many different names and notations used to describe 
the features of linear programming problems, particularly when they 
are discussed in the context of a specific application. Alternatives 
are mentioned at appropriate places in the text whenever particular 
names are defined. 

There are two other points of notation, more literary than mathematical,
which should be mentioned here. The use of optimal and
optimum as appropriate seems to cause more confusion than the
possible ambiguity of meaning that it avoids. The use of optimum
as an adjective is now acceptable, so optimum is used throughout
and optimal appears only in optimality. In chapter 13, on game
theory, the distinction between strategy and stratagem is maintained 
although it is a common practice not to do so. 


CHAPTER 1 


A SORT OF INTRODUCTION 


1.1 

Linear programming is concerned with the problem of finding the 
optimum (maximum or minimum) value of a linear function subject 
to a number of linear constraints on the variables. It is a particular 
case of the general mathematical optimisation problem in which the 
objective function and the constraint functions may be non-linear. 
The general problem can properly be regarded as a branch of mathe- 
matical analysis, involving the calculus of functions of many variables. 
The methods for solving the problem are iterative, and use the ideas 
of convergence and rate of convergence. There are many different 
methods which are more or less satisfactory, depending on the 
particular functions involved. In the linear case, one method (the 
simplex method) can be used to solve any problem in a finite number 
of steps. However there are different versions or special methods 
which are more efficient for particular linear programming problems, 
and there is the method of ellipsoids discussed in chapter 9. 

The use of the word programming in this context, and of mathemati- 
cal programming for general optimisation problems, should not be 
confused with computer programming, although in practice non-trivial 
problems would be solved on a computer. Many linear programming 
problems are directly related to real-life situations and the solution 
of each describes the optimum arrangement or programme for the 
situation. 

We begin by considering briefly two such situations, each of which
leads to a classical linear programming problem (l.p.p.).


1.2 The Diet Problem 
Imagine a dietician who wishes to determine the cheapest possible 
diet satisfying prescribed nutritional requirements and using certain 
specified foods. 
Let m be the number of nutrients; n the number of foods; $b_i$,
i = 1, 2, ..., m, the amount of the i-th nutrient required; $c_j$,
j = 1, 2, ..., n, the cost/unit of the j-th food; and $a_{ij}$ the number of
units of the i-th nutrient in each unit of the j-th food. For convenience
we may consider a daily diet and all measurements in grams. If $x_j$,
j = 1, 2, ..., n, is the number of units of the j-th food in the diet,
then the total cost is

$$c_1 x_1 + c_2 x_2 + \cdots + c_n x_n.$$

The total amount of the i-th nutrient is

$$a_{i1} x_1 + a_{i2} x_2 + \cdots + a_{in} x_n,$$

which must be greater than or equal to $b_i$, i = 1, 2, ..., m. In addition,
to rule out macabre possibilities, there are the constraints $x_j \ge 0$,
j = 1, 2, ..., n. Thus the dietician’s problem becomes to
choose $x_1, x_2, \ldots, x_n$ such that

$$c_1 x_1 + c_2 x_2 + \cdots + c_n x_n$$

is a minimum subject to

$$a_{i1} x_1 + a_{i2} x_2 + \cdots + a_{in} x_n \ge b_i \quad \text{and} \quad x_j \ge 0,$$

for i = 1, 2, ..., m and j = 1, 2, ..., n.
We may write this problem as follows:

$$\text{minimise } f(\mathbf{x}) = \mathbf{c}^T\mathbf{x} \text{ subject to } \mathbf{A}\mathbf{x} \ge \mathbf{b},\ \mathbf{x} \ge \mathbf{0}, \qquad (1)$$

where A is the $m \times n$ matrix of nutrient coefficients, c the n-vector
of cost coefficients and b the m-vector of nutrient requirements.
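
As a small numerical illustration of the diet problem, the following Python sketch evaluates the cost of a proposed diet and checks it against the nutrient requirements and the non-negativity constraints. The foods, nutrient contents and prices are invented for illustration and are not taken from the text; the function names are likewise only illustrative.

```python
def diet_cost(c, x):
    """Total cost c^T x of a diet x."""
    return sum(cj * xj for cj, xj in zip(c, x))


def is_feasible_diet(A, b, x):
    """Check Ax >= b and x >= 0 for the diet problem."""
    if any(xj < 0 for xj in x):
        return False
    for row, bi in zip(A, b):
        if sum(aij * xj for aij, xj in zip(row, x)) < bi:
            return False
    return True


# Hypothetical data: m = 2 nutrients, n = 3 foods.
A = [[2, 1, 0],   # units of nutrient 1 per unit of each food
     [1, 3, 2]]   # units of nutrient 2 per unit of each food
b = [8, 12]       # required amounts of each nutrient
c = [3, 2, 4]     # cost per unit of each food

x = [3, 3, 0]     # a proposed diet
print(is_feasible_diet(A, b, x), diet_cost(c, x))
```

Such a check only tests one candidate diet; finding the cheapest feasible diet is the purpose of the methods developed in later chapters.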


1.3. The Transportation Problem 

Imagine m sources or depots $D_1, D_2, \ldots, D_m$, where there are $d_1,
d_2, \ldots, d_m$ units respectively of some commodity, and n locations
or destinations $B_1, B_2, \ldots, B_n$ which require $b_1, b_2, \ldots, b_n$ units
of the commodity. The cost of transporting one unit of the commodity
from $D_i$ to $B_j$ is $c_{ij}$. The problem for the person deciding what
the transportation arrangements should be is to choose $x_{ij}$, the
amount of the commodity to be transported from $D_i$ to $B_j$, for
i = 1, 2, ..., m, j = 1, 2, ..., n, such that the total cost
$\sum_{i=1}^{m} \sum_{j=1}^{n} c_{ij} x_{ij}$
is a minimum subject to the following constraints:

$$\sum_{j=1}^{n} x_{ij} = \text{total amount taken from } D_i \le d_i, \quad i = 1, 2, \ldots, m,$$

$$\sum_{i=1}^{m} x_{ij} = \text{total amount taken to } B_j \ge b_j, \quad j = 1, 2, \ldots, n,$$

$$x_{ij} \ge 0, \quad i = 1, 2, \ldots, m, \quad j = 1, 2, \ldots, n.$$

This problem is again to optimise a linear function of the variables,
the $x_{ij}$, subject to a set of linear constraints.

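Suppose that $\sum_i d_i = \sum_j b_j$. Then in order to satisfy all the destination
requirements all of the commodity available at the sources must be
taken and no destination can be supplied with more than its requirement;
so in this case we may rewrite the constraints as

$$\sum_{j=1}^{n} x_{ij} = d_i, \quad i = 1, 2, \ldots, m,$$

$$\sum_{i=1}^{m} x_{ij} = b_j, \quad j = 1, 2, \ldots, n, \quad \text{and}$$

$$x_{ij} \ge 0, \quad i = 1, 2, \ldots, m, \quad j = 1, 2, \ldots, n.$$

In matrix notation the problem is

$$\text{minimise } \mathbf{c}^T\mathbf{x} \text{ subject to } \mathbf{A}\mathbf{x} = \mathbf{b},\ \mathbf{x} \ge \mathbf{0}, \qquad (2)$$

where

$$\mathbf{c}^T = (c_{11}, c_{12}, \ldots, c_{1n}, c_{21}, c_{22}, \ldots, c_{2n}, \ldots, c_{m1}, c_{m2}, \ldots, c_{mn}),$$

$$\mathbf{x}^T = (x_{11}, x_{12}, \ldots, x_{1n}, x_{21}, x_{22}, \ldots, x_{2n}, \ldots, x_{m1}, x_{m2}, \ldots, x_{mn}),$$

$$\mathbf{b}^T = (d_1, d_2, \ldots, d_m, b_1, b_2, \ldots, b_n)$$

and A is the $(m+n) \times (mn)$ matrix with the following form (ER):

$$\mathbf{A} = \begin{pmatrix}
\mathbf{e}^T & \mathbf{0}^T & \cdots & \mathbf{0}^T \\
\mathbf{0}^T & \mathbf{e}^T & \cdots & \mathbf{0}^T \\
\vdots & & \ddots & \vdots \\
\mathbf{0}^T & \mathbf{0}^T & \cdots & \mathbf{e}^T \\
\mathbf{I}_n & \mathbf{I}_n & \cdots & \mathbf{I}_n
\end{pmatrix}$$

Here the first m rows correspond to the depots and the last n rows to
the destinations; there are m blocks of n columns, and $\mathbf{e}^T$ and $\mathbf{0}^T$
denote rows of n ones and n zeros respectively.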
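
The special structure of A can be generated mechanically. The Python sketch below builds the (m+n) × (mn) coefficient matrix of the balanced transportation problem for given m and n. It assumes that the variables are ordered $x_{11}, \ldots, x_{1n}, x_{21}, \ldots, x_{mn}$, with the m depot equations listed before the n destination equations; the function name is an illustrative choice.

```python
def transportation_matrix(m, n):
    """Coefficient matrix of the balanced transportation problem.

    Rows 0..m-1 are the depot constraints (sum over j of x_ij = d_i);
    rows m..m+n-1 are the destination constraints (sum over i of x_ij = b_j).
    Column i*n + j corresponds to x_ij (0-based i, j).
    """
    A = [[0] * (m * n) for _ in range(m + n)]
    for i in range(m):
        for j in range(n):
            col = i * n + j
            A[i][col] = 1        # x_ij is taken from depot i
            A[m + j][col] = 1    # x_ij is delivered to destination j
    return A


for row in transportation_matrix(2, 3):
    print(row)
```

Every column contains exactly two ones, one in a depot row and one in a destination row, whatever the costs and quantities of the particular problem.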
1.4 


The diet problem and the transportation problem provide us with 
several useful insights into the general situation. 

Each problem is a mathematical model of a real situation and the 
solutions are useful only to the extent that the model corresponds 
to the reality. In the diet problem, for example, the cheapest diet 
may be too unappetizing to be acceptable: a diet of dried milk powder 
and soya beans was once suggested for the U.S. army. Also, costs 
are rarely directly proportional to quantities of commodities, either 
of buying or of transporting. This aspect of formulating the model 
is one we choose to ignore and we just accept some simple-minded
interpretations of particular situations which lead to l.p.p.s. 

We would not expect the coefficient matrix A in the diet problem 
to have any special structure, and it would change for different foods 
and different identified nutrients. But the coefficient matrix A of 
the transportation problem has a strikingly special structure which 
will be the same for any transportation problem. We can take advantage 
of this special, constant structure to devise an efficient method of 
solution (see chapter 10). 

The commodity in the transportation problem may be continuous, 
for example oil, or discrete, for example pianos. In the latter case 
the vectors b and d will have integer elements and we will only 
be interested in integer solutions. For the transportation problem and 
some other particular problems this presents no difficulties as we 
shall see. There is no single method which is suitable for solving 
integer linear programming problems in general. 


1.5 
It is useful to consider the case of only two variables, because
then the problem can easily be described by a diagram in the
$(x_1, x_2)$ plane. The constraint

$$a_1 x_1 + a_2 x_2 = b$$

restricts us to a straight line in the $(x_1, x_2)$ plane, and divides the
$(x_1, x_2)$ plane into two half-planes: one consists of all points $(x_1, x_2)$
satisfying $a_1 x_1 + a_2 x_2 \ge b$, and the other consists of all points $(x_1, x_2)$
satisfying $a_1 x_1 + a_2 x_2 \le b$. The same is true, of course, for non-negativity
constraints such as $x_1 \ge 0$, $x_2 \ge 0$. The region of the $(x_1, x_2)$
plane satisfying all the constraints for any given problem is thus the
intersection of a number of half-planes and there are various possibilities
for this region (which we shall call R). These are illustrated
in the diagrams on page 5 where a small arrow indicates, for each
constraint, the half-plane in which the constraint is satisfied.
Now consider the objective function f,

$$f(x_1, x_2) = c_1 x_1 + c_2 x_2.$$

For any two values $f_1$ and $f_2$, $c_1 x_1 + c_2 x_2 = f_1$ is a line in the
$(x_1, x_2)$ plane, and $c_1 x_1 + c_2 x_2 = f_2$ is another parallel line. In the
diagram on page 6, which illustrates the case $c_1 c_2 > 0$, $f_2 > f_1$ if
$c_1 > 0$ and $f_2 < f_1$ if $c_1 < 0$ (ER). Different values of f correspond
to different lines $c_1 x_1 + c_2 x_2 = f$ parallel to the two illustrated.

[Diagrams: (i) R bounded; (ii) R unbounded; (iii) R empty; (iv) R a single point.]


As we move in the direction of the arrow the value f of the objective
function increases if $c_1 > 0$ and decreases if $c_1 < 0$.

If we superimpose this diagram on say diagram (i) above, then
it is clear that the maximum or minimum value of f is attained on
the boundary of R, and either at a vertex of R or along one side
of R. Diagrams (ii) and (iii) indicate that there may be no optimum
solution for some functions f or no optimum solution for any function
f.
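
The observation that the optimum is attained at a vertex suggests a crude computational check for two-variable problems: intersect the boundary lines in pairs, keep the intersection points that satisfy every constraint, and compare the objective values there. The Python sketch below does this for constraints written in the form $a_1 x_1 + a_2 x_2 \le b$. The example data are invented, and the approach assumes that R is bounded and non-empty; it is an illustration only, not the method developed in this book.

```python
from fractions import Fraction
from itertools import combinations


def vertices(constraints):
    """Vertices of the region {x : a1*x1 + a2*x2 <= b for each (a1, a2, b)}."""
    points = []
    for (a1, a2, b), (p1, p2, q) in combinations(constraints, 2):
        det = a1 * p2 - a2 * p1
        if det == 0:
            continue  # parallel boundary lines do not intersect
        # Cramer's rule for the intersection of the two boundary lines.
        x1 = Fraction(b * p2 - a2 * q, det)
        x2 = Fraction(a1 * q - b * p1, det)
        if all(c1 * x1 + c2 * x2 <= d for c1, c2, d in constraints):
            if (x1, x2) not in points:
                points.append((x1, x2))
    return points


def best_vertex(constraints, c1, c2):
    """Vertex of a bounded region at which c1*x1 + c2*x2 is largest."""
    return max(vertices(constraints), key=lambda p: c1 * p[0] + c2 * p[1])


# Hypothetical example: x1 + x2 <= 4, x1 <= 3, x1 >= 0, x2 >= 0.
# The non-negativity constraints are written as -x1 <= 0 and -x2 <= 0.
R = [(1, 1, 4), (1, 0, 3), (-1, 0, 0), (0, -1, 0)]
print(vertices(R))
print(best_vertex(R, 2, 1))
```

The number of pairs of lines grows quickly with the number of constraints, which is one reason why a more systematic way of moving between vertices is needed.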

For more general problems involving n variables $x_1, x_2, \ldots, x_n$, the
situation is similar. The points x satisfying a constraint

$$a_1 x_1 + a_2 x_2 + \cdots + a_n x_n = b$$

lie on a hyperplane in n-dimensional space and those satisfying

$$a_1 x_1 + a_2 x_2 + \cdots + a_n x_n \le b$$

constitute a half-space. An algebraic characterisation of the observations
made above for 2-space is established in chapter 2 and leads
us to the simplex method. For the moment we just remark that 
we should not rely on the two-dimensional diagrams too much, useful 
though they are; in particular, we cannot easily illustrate a non-trivial 
problem with equality constraints. A similar comment should be made 
about the numerical examples used to illustrate the theory. These 
generally have two or three constraints and four to six variables and 
this makes them just large enough for useful illustration, but they 
are not really large enough to need the theoretical development. In 
real-life problems of medium size we may have several hundred 
variables, and in large problems many thousands. 




Exercises 1 

1. How is the l.p.p. (1) of section 1.2 for the diet problem changed 
if 

(i) the nutritional requirements are to be satisfied exactly, 
(ii) the nutritional requirements include maximum as well as 
minimum quantities for some nutrients, 

(iii) instead of minimising the cost, the dietician decides to maximise 
the attractiveness of the diet, subject to a maximum cost c, 
by giving each food an enjoyment coefficient $p_j$, j = 1, 2, ..., n?

2. For the diet problem, discuss the effect of changing the units. 
Suppose, for example, that measurement by volume was preferred 
to measurement by weight for some food. 

3. How is the l.p.p. (2) for the transportation problem and the optimum
solution changed if all transportation costs from a particular depot 
or to a particular destination are increased by k (>0)? 

4. A manager of a company wishes to supply n of the company’s 
factories with specified quantities of a certain raw material. The 
company advertises its desire to buy this raw material and receives 
offers of specified amounts and prices from m suppliers. The 
manager works out the (mn) transportation costs and then has to 
decide how much to buy from each supplier and which factories 
to supply with it. Formulate the manager’s problem as a l.p.p.
of transportation type. 

This is a simple version of the contract-award problem. Suggest 
some likely additional complications in practice. 

5. A manufacturer has amounts $b_i$, i = 1, 2, ..., m, of m resources
which he uses to make n products. He knows the amount $a_{ij}$ of
the i-th resource needed to produce one unit of the j-th product,
and the profit $c_j$ he makes on one unit of the j-th product. Express
as a l.p.p. the manufacturer’s problem of choosing how much
of each product to make, so that his total profit is maximised
subject to his available resources. (In this context the elements
$a_{ij}$ are input-output coefficients, sometimes called requirement or
activity coefficients.)

6. Examine the following l.p.p.s graphically. They illustrate the various
situations that can occur. In each case $x_1, x_2 \ge 0$.


(i). 3x, +.5x, 3 15 (ii) 3x, + 5x, <= 15 
5x, + 2x, = 10 5X, 2x5 S10 
maximise 5x, + 3x,. maximise 2.5 x, + X,. 
(iii) x,- x,2-1 (iv) x,+ x,s1 
—X,+2x,s 4 2x, + 2x, = 4 
maximise 2x, + 2x,. maximise 3x, — 2x). 
(v) x,-—x,= 0 (Vip —x;'+ 3, = 1 
3x, — x, = -—3 ¥, + %,'= 1 
maximise x, + X). maximise C,X, + C,X. 
7. Discuss the advantages and disadvantages of “simple” mathematical
models.

8. By means of a simple diagram in the $(x_1, x_2)$ plane, show that
we cannot solve a l.p.p. in which an integer solution is required
by finding the optimum general solution and then taking the nearest
“integer point” to this solution.


CHAPTER 2 


CONVERSION TO SPECIFIED FORM; BASIC, 
FEASIBLE AND OPTIMUM SOLUTIONS 


2.1

Constraints involving > or < do not concern us. Mathematically
they define open sets of points on which a function may approach
arbitrarily close to an optimum value but not actually attain it. In
the few cases in which they are appropriate in practice, they can
usually be easily replaced by meaningful constraints involving
$\ge$ or $\le$.

As we have seen, the constraints in a l.p.p. may involve $\ge$, $\le$,
= or a mixture. We now see, by means of simple examples, how
the nature of constraints can be changed by the introduction of extra
non-negative variables.


(i) Inequalities can be reversed by multiplying by $-1$. The inequality
constraint
$$-2x_1 + 3x_2 \le 5$$
is equivalent to the constraint
$$2x_1 - 3x_2 \ge -5.$$
(ii) Inequality constraints can be converted to equality constraints
by introducing slack or surplus variables. The constraint
$$2x_1 - 3x_2 \le 5$$
is equivalent to the two constraints
$$2x_1 - 3x_2 + x_3 = 5, \quad x_3 \ge 0.$$
Here $x_3$ is called a slack variable: it tells us how much slack
there is before the constraint becomes active or binding.
The constraint
$$2x_1 - 3x_2 \ge 5$$
is equivalent to the two constraints
$$2x_1 - 3x_2 - x_3 = 5, \quad x_3 \ge 0.$$
Here $x_3$ is called a surplus variable.
(iii) Equality constraints can be converted into pairs of inequality
constraints. The constraint
$$2x_1 - 3x_2 = 5$$
is equivalent to the two constraints
$$2x_1 - 3x_2 \le 5, \quad 2x_1 - 3x_2 \ge 5.$$
Note that an equality constraint cannot be converted into an
inequality constraint by the introduction of a slack or a surplus
variable (ER).
(iv) A variable not restricted in sign is called a free variable, and
can be replaced by the difference of two non-negative variables.
So the constraints
$$2x_1 - 3x_2 \le 5, \quad x_1 + x_2 \ge 1, \qquad (1)$$
where $x_1$, $x_2$ are free variables, are equivalent to the constraints
$$2z_1 - 3z_2 + z_3 \le 5,$$
$$z_1 + z_2 - 2z_3 \ge 1, \quad z_1, z_2, z_3 \ge 0,$$
where $x_1 = z_1 - z_3$, $x_2 = z_2 - z_3$.
The constraints (1) are also equivalent to the constraints
$$2z_1 - 2z_2 - 3z_3 + 3z_4 + z_5 = 5,$$
$$z_1 - z_2 + z_3 - z_4 - z_6 = 1, \quad z_1, z_2, \ldots, z_6 \ge 0,$$
where $x_1 = z_1 - z_2$, $x_2 = z_3 - z_4$ and $z_5$, $z_6$ are slack variables.

(v) We also note that
$$\max_{\mathbf{x} \in R}\,(2x_1 - 3x_2) = -\min_{\mathbf{x} \in R}\,(-2x_1 + 3x_2).$$
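
The conversion of inequalities to equalities described in (ii) can be carried out mechanically. The following Python sketch appends one slack or surplus variable per inequality constraint. The representation of a constraint as a (coefficients, relation, right-hand side) triple and the function name are assumptions made for this illustration, and the example uses the expression $2x_1 - 3x_2$ with right-hand side 5.

```python
def to_equalities(constraints):
    """Convert constraints (coeffs, rel, rhs), rel in {'<=', '>=', '='},
    to equalities by adding slack (+1) or surplus (-1) variables.

    Returns (rows, rhs): each row lists the coefficients of the original
    variables followed by those of the added non-negative variables.
    """
    extra = sum(1 for _, rel, _ in constraints if rel != '=')
    rows, rhs, k = [], [], 0
    for coeffs, rel, b in constraints:
        added = [0] * extra
        if rel == '<=':
            added[k] = 1     # slack variable
            k += 1
        elif rel == '>=':
            added[k] = -1    # surplus variable
            k += 1
        rows.append(list(coeffs) + added)
        rhs.append(b)
    return rows, rhs


# 2x1 - 3x2 <= 5 gains a slack variable; 2x1 - 3x2 >= 5 gains a surplus variable.
rows, rhs = to_equalities([([2, -3], '<=', 5), ([2, -3], '>=', 5)])
print(rows)
print(rhs)
```

Each added variable is understood to be non-negative; an equality constraint is passed through unchanged.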


2.2 

Two particular forms of l.p.p. which we shall call standard form 
and canonical form are of special interest. 

It should be pointed out that it is not unknown for these names 
to be given different meanings and for these forms to be given different 
names. 


(i) Standard Form 
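minimise $f(\mathbf{x}) = c_1 x_1 + c_2 x_2 + \cdots + c_n x_n$ subject to

$$\begin{aligned}
a_{11} x_1 + a_{12} x_2 + \cdots + a_{1n} x_n &\ge b_1, \\
a_{21} x_1 + a_{22} x_2 + \cdots + a_{2n} x_n &\ge b_2, \\
&\;\;\vdots \\
a_{m1} x_1 + a_{m2} x_2 + \cdots + a_{mn} x_n &\ge b_m, \\
x_1, x_2, \ldots, x_n &\ge 0.
\end{aligned}$$

In matrix notation, we have

$$\text{minimise } \mathbf{c}^T\mathbf{x} \text{ subject to } \mathbf{A}\mathbf{x} \ge \mathbf{b},\ \mathbf{x} \ge \mathbf{0}. \qquad (1)$$

(ii) Canonical Form

$$\text{minimise } \mathbf{c}^T\mathbf{x} \text{ subject to } \mathbf{A}\mathbf{x} = \mathbf{b},\ \mathbf{x} \ge \mathbf{0}. \qquad (2)$$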


Canonical form is particularly important because l.p.p.s are converted
to this form before they are solved using the simplex method.

We can always arrange that $\mathbf{b} \ge \mathbf{0}$ for a l.p.p. in canonical form
without changing the form of the constraints (ER). This is a vital
condition for the development of the simplex method, and so whenever
we are concerned with solving a l.p.p. we include $\mathbf{b} \ge \mathbf{0}$ as part
of the definition of canonical form.


2.3 
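Conversion from standard form to canonical form just requires
the introduction of m surplus variables, so that the constraints become

$$a_{i1} x_1 + a_{i2} x_2 + \cdots + a_{in} x_n - z_i = b_i, \quad i = 1, 2, \ldots, m, \qquad (3)$$

$$x_j \ge 0,\ j = 1, 2, \ldots, n; \quad z_i \ge 0,\ i = 1, 2, \ldots, m.$$

So minimise $\mathbf{c}^T\mathbf{x}$ subject to $\mathbf{A}\mathbf{x} \ge \mathbf{b}$, $\mathbf{x} \ge \mathbf{0}$ becomes

$$\text{minimise } \bar{\mathbf{c}}^T\bar{\mathbf{x}} \text{ subject to } \bar{\mathbf{A}}\bar{\mathbf{x}} = \bar{\mathbf{b}},\ \bar{\mathbf{x}} \ge \mathbf{0}, \qquad (4)$$

where $\bar{\mathbf{c}}^T = (c_1, c_2, \ldots, c_n, 0, 0, \ldots, 0) = (\mathbf{c}^T, \mathbf{0}_m^T)$,
$\bar{\mathbf{x}}^T = (x_1, x_2, \ldots, x_n, z_1, z_2, \ldots, z_m) = (\mathbf{x}^T, \mathbf{z}^T)$,
$\bar{\mathbf{b}} = \mathbf{b}$, and

$$\bar{\mathbf{A}} = \begin{pmatrix}
a_{11} & a_{12} & \cdots & a_{1n} & -1 & & \\
\vdots & \vdots & & \vdots & & \ddots & \\
a_{m1} & a_{m2} & \cdots & a_{mn} & & & -1
\end{pmatrix} = (\mathbf{A}, -\mathbf{I}_m).$$

Here $\mathbf{I}_m$ denotes the $m \times m$ unit matrix and $\mathbf{0}_m$ the zero vector with
m elements, and we have used the idea of partitioned matrices.

In this case, as an example, the m equations of (3) (and the same
m equations of (4)) are the same as

$$(\mathbf{A}, -\mathbf{I}_m) \begin{pmatrix} \mathbf{x} \\ \mathbf{z} \end{pmatrix} = \mathbf{b} = \mathbf{A}\mathbf{x} - \mathbf{I}_m\mathbf{z} = \mathbf{A}\mathbf{x} - \mathbf{z}. \qquad (5)$$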


Also, it is worth emphasising that (1) and (4) really are the same
linear programming problem. If $\begin{pmatrix} \mathbf{x}_0 \\ \mathbf{z}_0 \end{pmatrix}$ satisfies the constraints of (4)
then $\mathbf{x}_0$ satisfies the constraints of (1), and any $\mathbf{x}_0$ which satisfies
the constraints of (1) defines a $\mathbf{z}_0$ such that $\begin{pmatrix} \mathbf{x}_0 \\ \mathbf{z}_0 \end{pmatrix}$ satisfies the constraints
of (4). The same argument holds for the vectors at which the minimum
is attained.
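
The correspondence between the standard-form problem and its canonical-form version can be checked numerically. The Python sketch below, with invented data, forms the partitioned matrix $(\mathbf{A}, -\mathbf{I}_m)$ and the extended cost vector, and computes the surplus vector $\mathbf{z} = \mathbf{A}\mathbf{x} - \mathbf{b}$ belonging to a point that is feasible for the standard-form problem. The function names are illustrative assumptions.

```python
def to_canonical(A, b, c):
    """Standard form (min c^T x, Ax >= b, x >= 0) to canonical form.

    Returns (A_bar, b_bar, c_bar) with A_bar = (A, -I_m),
    b_bar = b and c_bar = (c, 0, ..., 0).
    """
    m = len(A)
    A_bar = [list(row) + [-1 if k == i else 0 for k in range(m)]
             for i, row in enumerate(A)]
    return A_bar, list(b), list(c) + [0] * m


def surplus(A, b, x):
    """Surplus vector z = Ax - b belonging to a point x."""
    return [sum(a * xj for a, xj in zip(row, x)) - bi
            for row, bi in zip(A, b)]


# Hypothetical data with m = 2, n = 2.
A = [[1, 2], [3, 1]]
b = [4, 6]
c = [5, 7]
A_bar, b_bar, c_bar = to_canonical(A, b, c)

x0 = [2, 2]                       # satisfies Ax >= b, x >= 0
z0 = surplus(A, b, x0)            # [2, 2]
x_bar = x0 + z0
# (x0, z0) satisfies the equality constraints of the canonical problem.
print([sum(a * v for a, v in zip(row, x_bar)) for row in A_bar] == b_bar)
```

Because the surplus variables have zero cost, the objective value at the extended point equals the objective value at the original point.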


2.4 

From now on we will assume that the l.p.p. is in canonical form,
with $\mathbf{b} \ge \mathbf{0}$. This can always be achieved by the methods of sections
2.1 and 2.3 although, as exercises 3.3 and 4.6 indicate, it may be
more efficient to do something else.

For the moment consider just the equality constraints Ax = b, where
A is $m \times n$, and suppose that m > n. Either b is in the column space
of A or it is not. If it is not, then there is no x such that Ax = b,
and b is not a linear combination of the columns of A, so we do
not have a l.p.p. to solve because the constraints cannot be satisfied.
If it is, then the rank of A, r(A), is the same as the rank of the
augmented matrix (A, b), and each is at most n. So at least (m − n)
rows of (A, b), i.e. at least (m − n) constraint equations, can be removed
because they are linear combinations of the remaining n equations.
This takes us to the case m = n.

If m = n and b is not in the column space of A, then r(A, b)
> r(A), and again there is no vector x satisfying the constraints so
we do not have a l.p.p. to solve. If b is in the column space of
A, then there is a unique solution x, which is the solution of a
corresponding l.p.p. in canonical form provided that $\mathbf{x} \ge \mathbf{0}$. If, however,
r(A, b) = r(A) = k < n, then (n − k) equations can be deleted, and
this takes us to the case m < n.

So we can now assume that A is $m \times n$ with m < n and we also
assume that r(A) = m. These assumptions are a matter of convenience 
for the development of the simplex method; problems where they 
are not the case can be dealt with automatically as we shall see 
in sections 4.3 and 4.5. In practice we do not have to perform the 
preliminary manipulations that the analysis above implies are necessary 
to ensure that m < n and r(A) = m. 
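
The test for whether b lies in the column space of A, namely r(A, b) = r(A), is easy to carry out by Gaussian elimination. The Python sketch below uses exact rational arithmetic to avoid rounding problems; the function names and the example matrices are illustrative assumptions.

```python
from fractions import Fraction


def rank(M):
    """Rank of a matrix (list of rows) by Gaussian elimination."""
    rows = [[Fraction(v) for v in row] for row in M]
    r = 0
    ncols = len(rows[0]) if rows else 0
    for col in range(ncols):
        # Find a row at or below position r with a non-zero entry in this column.
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [u - factor * v for u, v in zip(rows[i], rows[r])]
        r += 1
    return r


def consistent(A, b):
    """True if Ax = b has a solution, i.e. r(A, b) = r(A)."""
    augmented = [list(row) + [bi] for row, bi in zip(A, b)]
    return rank(augmented) == rank(A)


# Three equations in two unknowns (m > n); the third row is the sum of the first two.
A = [[1, 0], [0, 1], [1, 1]]
print(rank(A), consistent(A, [2, 3, 5]), consistent(A, [2, 3, 6]))
```

In the first right-hand side the third equation is redundant and can be removed; in the second the equations are inconsistent and there is no l.p.p. to solve.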

We have denoted by R the set of vectors satisfying the constraints.
Thus

$$R = \{\mathbf{x} \mid \mathbf{A}\mathbf{x} = \mathbf{b},\ \mathbf{x} \ge \mathbf{0}\}.$$

We refer to R as the feasible region of n-space and we say a solution
x of the equations Ax = b is feasible if $\mathbf{x} \ge \mathbf{0}$.

A feasible solution for which $\mathbf{c}^T\mathbf{x}$ is a minimum is called an optimum
solution, and the value $f(\mathbf{x}) = \mathbf{c}^T\mathbf{x}$ of such a solution the optimum
value (of the l.p.p.).


2.5 Convex Sets 

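A set S is said to be convex if $\mathbf{x}_1, \mathbf{x}_2 \in S$, $0 \le \alpha \le 1$ implies that
$\mathbf{y} = \alpha\mathbf{x}_1 + (1-\alpha)\mathbf{x}_2 \in S$.

The point y in S is said to be a convex combination of $\mathbf{x}_1$ and
$\mathbf{x}_2$. (Strictly speaking, y is a convex combination of $\mathbf{x}_1$ and $\mathbf{x}_2$ if
$\mathbf{y} = \alpha\mathbf{x}_1 + (1-\alpha)\mathbf{x}_2$ for some $\alpha$ satisfying $0 \le \alpha \le 1$, and a proper,
or nontrivial, convex combination if $0 < \alpha < 1$.)

The set of all convex combinations of $\mathbf{x}_1$ and $\mathbf{x}_2$ is the set of points
on the straight-line segment joining $\mathbf{x}_1$ and $\mathbf{x}_2$, and so a convex set
contains the line segment joining any two points in the set.

A general convex combination of r points $\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_r$ is any
y where

$$\mathbf{y} = \sum_{i=1}^{r} \alpha_i \mathbf{x}_i, \quad \alpha_i \ge 0,\ i = 1, 2, \ldots, r, \quad \sum_{i=1}^{r} \alpha_i = 1,$$

and we can prove, inductively, that if S is convex, $\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_r \in S$
then $\mathbf{y} \in S$. Alternatively, any set of points $\mathbf{x}_1, \ldots, \mathbf{x}_r$ defines
a convex set that consists of all points which are convex combinations
of them.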

A half-space is convex, and so is the intersection of any finite 
number of convex sets (ER). This establishes that R is convex, but 
we prove this result directly. 


Theorem 1 

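The set R of feasible solutions to a l.p.p. is convex. ■

Suppose $\mathbf{x}_1, \mathbf{x}_2 \in R$. Then $\mathbf{A}\mathbf{x}_1 = \mathbf{b}$, $\mathbf{A}\mathbf{x}_2 = \mathbf{b}$, $\mathbf{x}_1 \ge \mathbf{0}$, $\mathbf{x}_2 \ge \mathbf{0}$.
So $\mathbf{y} = \alpha\mathbf{x}_1 + (1-\alpha)\mathbf{x}_2 \ge \mathbf{0}$ because for each element $y_j$ of y

$$y_j = \alpha(\mathbf{x}_1)_j + (1-\alpha)(\mathbf{x}_2)_j,$$

which is the sum of non-negative quantities.

Also

$$\mathbf{A}\mathbf{y} = \mathbf{A}(\alpha\mathbf{x}_1 + (1-\alpha)\mathbf{x}_2) = \alpha\mathbf{A}\mathbf{x}_1 + (1-\alpha)\mathbf{A}\mathbf{x}_2 = \alpha\mathbf{b} + (1-\alpha)\mathbf{b} = \mathbf{b}.$$

Therefore $\mathbf{y} \in R$, and therefore R is convex. ■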
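
Theorem 1 can be illustrated numerically. In the Python sketch below, two feasible points of a small invented canonical-form problem are combined with several values of α, and each combination is checked against the constraints Ax = b and the non-negativity conditions. Exact fractions are used so that the equality test is reliable; the data and function names are assumptions for this illustration.

```python
from fractions import Fraction


def in_R(A, b, x):
    """True if x satisfies Ax = b and x >= 0."""
    if any(xj < 0 for xj in x):
        return False
    return all(sum(a * xj for a, xj in zip(row, x)) == bi
               for row, bi in zip(A, b))


def convex_combination(x1, x2, alpha):
    """The point alpha*x1 + (1 - alpha)*x2."""
    return [alpha * u + (1 - alpha) * v for u, v in zip(x1, x2)]


# Hypothetical constraints: x1 + x2 + x3 = 4, x1 - x2 = 0.
A = [[1, 1, 1], [1, -1, 0]]
b = [4, 0]
x1 = [2, 2, 0]
x2 = [0, 0, 4]

for k in range(5):
    alpha = Fraction(k, 4)
    y = convex_combination(x1, x2, alpha)
    print(alpha, y, in_R(A, b, y))
```

A finite check of this kind is of course no substitute for the proof; it merely shows the two steps of the argument at work on particular numbers.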

In 2-space, as the diagram (i) on p. 5 indicates, if R is bounded 
but non-trivial, it is a polygon with no re-entrant vertices and the 
optimum value of any objective function will be attained at a vertex. 
It may possibly be attained at all points of one of the sides of the
polygon (see exercise 1.6 (ii)). One can picture the corresponding 
situation in 3-space where the boundary of R consists of sections 
of planes. The situation is essentially the same in n-space where 
the boundary of R consists of sections of (n — 1)-dimensional hyper- 
planes and R is called a polytope. The picture is more difficult to 
imagine for a l.p.p. in canonical form where the set of solutions
of the equality constraints has no interior points. However, in every 
case the crucial points of R are the vertices, also called extreme 
points, which are formally defined by their characteristic property 
as follows. 

A point x of a convex set S is an extreme point (or vertex) of 
S if it cannot be written as a proper convex combination of two 
distinct points of S. At one of the extreme points of R, at least, 
the optimum value of the objective function is attained. To establish 
this rigorously using a geometrical approach for l.p.p.s in general
is rather tedious, partly because of the possibility of R being unbounded, 
and it would be simpler and sufficient for our purposes to establish 
the following theorem. 


Theorem 2 

If a l.p.p. has a finite optimum solution, then the optimum value
is attained at an extreme point of the feasible region.

Proving this theorem is the natural next stage in the development 
of linear programming theory, and the result is a vital step in the 
development. However, the rather tortuous proof is quite different 
in spirit from the rest of the development so the proof is not presented 
here but is to be found in Appendix 1, and the result is established 
by a different approach in theorem 4. 

Another alternative approach is outlined in exercise 2.4. The very 
reasonable assumptions stated in that exercise can be established with 
certain provisos, but this is a rather tedious business and the task 
in exercise 2.4 is just the satisfying end product. 

The nature of the proof of theorem 2 indicates that a geometrical 
approach has severe limitations, and an algebraic approach is needed 
to actually calculate solutions of l.p.p.s. Indeed, the definition of
an extreme point indicates this, because although it is clear and succinct 
there is no obvious way to use it in practice, even to decide whether 
a particular point of R is an extreme point. So, with the insights 
provided by the geometrical approach, we now change to an algebraic 
view of l.p.p.s and develop a computable characterisation of extreme
points. This is a crucial step in the development of a computable
method or algorithm for solving l.p.p.s, and it is worth noting that
the essential difference is that from an algebraic point of view the
points of R are described with respect to a coordinate system whereas
the geometrical approach is coordinate-free.


2.7 Basic Solutions 

In canonical form, the equality constraints are $Ax = b$, where A
is $m \times n$ and $m < n$. We are assuming that $r(A) = m$, so that there
are m columns of A which are linearly independent and without loss
of generality (w.l.o.g.) we can assume that the first m columns of
A are linearly independent since this could be arranged simply by
renumbering the variables $x_1, x_2, \ldots, x_n$. Remember that $Ax = b$ is the
same as

$$x_1 a_{\cdot 1} + x_2 a_{\cdot 2} + \cdots + x_n a_{\cdot n} = b,$$

where $a_{\cdot j}$ is the j-th column of A, for $j = 1, 2, \ldots, n$. If we put
$x_{m+1} = x_{m+2} = \cdots = x_n = 0$, then we have a system of m equations in
m variables $x_1, x_2, \ldots, x_m$. The matrix of coefficients $A_1$ is non-singular
so there is a unique solution, $A_1^{-1}b$, which gives us a solution x,

$$x_j = \begin{cases} (A_1^{-1}b)_j, & j = 1, 2, \ldots, m, \\ 0, & j = m+1, m+2, \ldots, n, \end{cases}$$

of the constraint equations in which at most m of the variables are
non-zero and these non-zero variables correspond to independent columns
of A. Another way to write this operation, and one we make use
of frequently, is to partition A into $A_1$, $(m \times m)$, and $A_2$, $m \times (n - m)$,
where

$$A_1 = \begin{pmatrix} a_{11} & \cdots & a_{1m} \\ \vdots & & \vdots \\ a_{m1} & \cdots & a_{mm} \end{pmatrix}, \qquad A_2 = \begin{pmatrix} a_{1,m+1} & a_{1,m+2} & \cdots & a_{1n} \\ a_{2,m+1} & a_{2,m+2} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{m,m+1} & a_{m,m+2} & \cdots & a_{mn} \end{pmatrix},$$

and to partition x into $\mathbf{x}_1$ and $\mathbf{x}_2$ conformally, so that

$$x = \begin{pmatrix} \mathbf{x}_1 \\ \mathbf{x}_2 \end{pmatrix}, \qquad \mathbf{x}_1 = (x_1, \ldots, x_m)^T, \qquad \mathbf{x}_2 = (x_{m+1}, \ldots, x_n)^T.$$

Then $Ax = b$ becomes $A_1\mathbf{x}_1 + A_2\mathbf{x}_2 = b$; notice that both $A_1\mathbf{x}_1$ and $A_2\mathbf{x}_2$ are
m-vectors. Putting $\mathbf{x}_2 = 0$ gives $\mathbf{x}_1 = A_1^{-1}b$ since $A_1$ is non-singular.

A solution of the constraint equations in which the non-zero variables 
correspond to independent columns of A is called a basic solution. 
At least one basic solution exists if r(A) = m because there are 
m independent columns, and in general there are many basic solutions. 

The m non-zero variables, $x_1, \ldots, x_m$ above, are called the basic
variables and the (n − m) zero variables, $x_{m+1}, \ldots, x_n$ above, the
non-basic variables. The m columns of A corresponding to the basic 
variables form a basis for m-space and each column of A can therefore 
be expressed as a unique linear combination of those m columns. 

Of course it may happen that some of the basic variables, k say,
are zero. In this case we say x is a degenerate basic solution. For
a degenerate basic solution, we will regard k of the (n − m + k) variables
with value zero as basic variables. Any k of the zero-valued variables
can be used provided that the complete set of m basic variables 
correspond to a non-singular m x m sub-matrix of A. This suggests, 
correctly, that a degenerate basic solution is equivalent to a number 
of coincident basic solutions. 

For example, if 


Ac ‘ ; ) and b= Hi; there are three basic solutions, 


$$(0, 2, 1)^T, \qquad (2, 0, -1)^T, \qquad (1, 1, 0)^T,$$

obtained by putting x,, x,, x, equal to zero in turn. 


If b = (7) however, we obtain 


$$(0, 1, 1)^T, \qquad (1, 0, 0)^T, \qquad (1, 0, 0)^T,$$

where the last two have $x_1$ and $x_3$, and $x_1$ and $x_2$ as the basic variables
respectively.
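
The procedure of this section (choose m columns, set the other variables to zero, solve the resulting m x m system) can be sketched in a few lines of Python. This is only an illustrative sketch: the helper names `solve_square` and `basic_solutions` are our own, and the matrix `A` and the two right-hand sides are hypothetical data chosen so that the second right-hand side produces two coincident (degenerate) basic solutions.

```python
from fractions import Fraction
from itertools import combinations

def solve_square(M, rhs):
    """Solve M z = rhs by Gauss-Jordan elimination; return None if M is singular."""
    n = len(M)
    aug = [[Fraction(v) for v in row] + [Fraction(r)] for row, r in zip(M, rhs)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if aug[r][col] != 0), None)
        if pivot is None:
            return None  # the chosen columns are dependent
        aug[col], aug[pivot] = aug[pivot], aug[col]
        aug[col] = [v / aug[col][col] for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [a - factor * p for a, p in zip(aug[r], aug[col])]
    return [row[-1] for row in aug]

def basic_solutions(A, b):
    """Map each independent choice of m columns (0-based) to its basic solution."""
    m, n = len(A), len(A[0])
    found = {}
    for cols in combinations(range(n), m):
        sub = [[A[i][j] for j in cols] for i in range(m)]
        z = solve_square(sub, b)
        if z is not None:
            x = [Fraction(0)] * n      # non-basic variables are zero
            for j, v in zip(cols, z):
                x[j] = v               # basic variables take the solved values
            found[cols] = x
    return found

# Hypothetical data for illustration only.
A = [[1, 1, 0],
     [1, 0, 1]]
non_degenerate = basic_solutions(A, [2, 1])
degenerate = basic_solutions(A, [1, 1])
```

With the second right-hand side, the column choices (0, 1) and (0, 2) give the same vector, which is the sense in which a degenerate basic solution is a number of coincident basic solutions. Exact fractions are used so that the test for a zero variable is reliable.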


2.8 Basic Feasible Solutions 

If a basic solution x of the constraint equations $Ax = b$ is non-negative,
i.e. $x \ge 0$, then it is a feasible solution and we have a basic
feasible solution (b.f.s.) of the l.p.p. in canonical form. It should
come as no surprise that b.f.s.s correspond to extreme points of
R, which we now establish in the two halves of theorem 3.


Theorem 3 
(i) A b.f.s. x is an extreme point of R.
We assume w.l.o.g. that $x_1, x_2, \ldots, x_k > 0$ (and that $x_1, x_2, \ldots, x_k,
x_{k+1}, \ldots, x_m$ are the basic variables although we shall not need
this). Partition x into $\mathbf{x}_1 = (x_1, \ldots, x_k)^T$ and
$\mathbf{x}_2 = (x_{k+1}, \ldots, x_n)^T = 0$, and suppose
x is not an extreme point of R. Then there are $\alpha$, y, z, with
$0 < \alpha < 1$, $y \ne z$ and $y, z \in R$, such that
$$\alpha y + (1 - \alpha)z = x.$$
This means that $\alpha y_j + (1 - \alpha)z_j = x_j$, which for $j > k$
(where $x_j = 0$, $y_j \ge 0$ and $z_j \ge 0$) implies $y_j = z_j = 0$.
Hence, with $\mathbf{y}_1 = (y_1, y_2, \ldots, y_k)^T$,
$\mathbf{z}_1 = (z_1, z_2, \ldots, z_k)^T$ and $A_1$ the first k columns of A,
$$Ay = A_1\mathbf{y}_1 + A_2\mathbf{y}_2 = A_1\mathbf{y}_1 = b = A_1\mathbf{z}_1.$$
Hence $A_1(\mathbf{y}_1 - \mathbf{z}_1) = 0$, which implies that either $\mathbf{y}_1 = \mathbf{z}_1$ or
the columns of $A_1$ are linearly dependent, neither of which
is true. Thus x is an extreme point of R.

(ii) If x is an extreme point of R, then x is a b.f.s.
We may assume w.l.o.g. that $x_1, \ldots, x_k > 0$ and $x_{k+1}, \ldots, x_n = 0$,
for some value of k, as in part (i). Partitioning x and A
as we did in (i), we have
$$A_1\mathbf{x}_1 = b, \qquad \mathbf{x}_1 > 0.$$
Suppose the columns of $A_1$ are linearly dependent. Then there
is a k-vector $\mathbf{y}_1$ such that
$$A_1\mathbf{y}_1 = 0 \quad \text{and} \quad \mathbf{y}_1 \ne 0.$$
Now for $\lambda > 0$, but sufficiently small,
$$\mathbf{x}_1 \pm \lambda\mathbf{y}_1 > 0, \qquad A_1(\mathbf{x}_1 \pm \lambda\mathbf{y}_1) = b,$$
so
$$z_1 = \begin{pmatrix} \mathbf{x}_1 + \lambda\mathbf{y}_1 \\ 0_{n-k} \end{pmatrix} \quad \text{and} \quad z_2 = \begin{pmatrix} \mathbf{x}_1 - \lambda\mathbf{y}_1 \\ 0_{n-k} \end{pmatrix}$$
both belong to R and $z_1 \ne z_2$. However $x = \tfrac{1}{2}z_1 + \tfrac{1}{2}z_2$, which implies that x is not an
extreme point of R. This contradicts the definition of x and so
the columns of $A_1$ are independent. Since A is $m \times n$ and $r(A) = m$
we have $k \le m$, and if $k < m$ then (m − k) of the (n − k) columns
of $A_2$ can be chosen to form a set of m independent columns
(see exercise 2.10). These (m − k) columns of $A_2$ define the
remaining (m − k) basic variables and hence x is a b.f.s.


2.9 Theorem 4. The Fundamental Theorem of Linear Programming 

If the l.p.p. minimise $c^T x$ subject to $Ax = b$, $x \ge 0$, where A
is $m \times n$ and $r(A) = m$, has a feasible solution then it has a b.f.s.,
and if it has an optimum solution then it has an optimum b.f.s.

To establish the first assertion, let x be a feasible solution and
assume w.l.o.g. that $x_1, x_2, \ldots, x_k > 0$ and $x_{k+1}, x_{k+2}, \ldots, x_n = 0$. If
the first k columns of A are linearly independent then $k \le m$. If $k = m$
then x is a b.f.s. If $k < m$ then x is a degenerate b.f.s. and (m − k)
of the zero elements of x can be chosen to make up the m basic
variables.

If the first k columns of A are not independent, then there is a
k-vector $\alpha \ne 0$ such that

$$\sum_{i=1}^{k} \alpha_i a_{\cdot i} = A_1\alpha = 0, \quad \text{and therefore} \quad \sum_{i=1}^{k} (x_i - \theta\alpha_i)\, a_{\cdot i} = b \quad \text{for any } \theta.$$

We can arrange that at least one element of $\alpha$ is positive, and
so as $\theta$ is increased from 0, for some value of $\theta$, $\theta_0$ say, and some
s, $1 \le s \le k$, we have $x_s - \theta_0\alpha_s = 0$ and $x_i - \theta_0\alpha_i \ge 0$ for $i \ne s$.

Denoting $x_i - \theta_0\alpha_i$ by $x'_i$, we now have a new feasible solution
$x'$, with at most (k − 1) non-zero elements. The process can be repeated
until the columns of A corresponding to non-zero elements of the
feasible solution are independent.

The second assertion of theorem 4 is left as an exercise (see exercise
2.11).
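
The reduction used in the first part of the proof is constructive, and a minimal sketch of it in Python may help. The function names `null_vector` and `reduce_to_bfs`, the matrix `A` and the starting point are our own illustrative assumptions; the starting point must already be feasible, which is exactly what the proof itself assumes.

```python
from fractions import Fraction

def null_vector(cols):
    """Return a non-zero alpha with sum(alpha[i] * cols[i]) = 0, or None if
    the given column vectors are linearly independent."""
    k, m = len(cols), len(cols[0])
    rows = [[Fraction(cols[j][i]) for j in range(k)] for i in range(m)]
    pivots = []                      # (row, column) of each leading 1
    r = 0
    for c in range(k):
        p = next((i for i in range(r, m) if rows[i][c] != 0), None)
        if p is None:
            continue                 # no pivot available: column c is free
        rows[r], rows[p] = rows[p], rows[r]
        rows[r] = [v / rows[r][c] for v in rows[r]]
        for i in range(m):
            if i != r and rows[i][c] != 0:
                f = rows[i][c]
                rows[i] = [a - f * q for a, q in zip(rows[i], rows[r])]
        pivots.append((r, c))
        r += 1
    pivot_cols = {c for _, c in pivots}
    free = next((c for c in range(k) if c not in pivot_cols), None)
    if free is None:
        return None
    alpha = [Fraction(0)] * k
    alpha[free] = Fraction(1)        # so at least one element is positive
    for row, c in pivots:
        alpha[c] = -rows[row][free]
    return alpha

def reduce_to_bfs(A, x):
    """Repeat the step x_i -> x_i - theta * alpha_i until the columns of A
    belonging to the positive elements of x are independent."""
    x = [Fraction(v) for v in x]
    while True:
        support = [j for j, v in enumerate(x) if v > 0]
        if not support:
            return x
        alpha = null_vector([[row[j] for row in A] for j in support])
        if alpha is None:
            return x
        # Largest theta keeping every element non-negative.
        theta = min(x[j] / a for j, a in zip(support, alpha) if a > 0)
        for j, a in zip(support, alpha):
            x[j] -= theta * a

# Hypothetical data: x_start satisfies A x = (2, 8) with every element positive.
A = [[-2, 1, 1, 0],
     [1, 2, 0, 1]]
x_start = [1, 1, 3, 5]
x_bfs = reduce_to_bfs(A, x_start)
```

Each pass makes at least one more element zero, so the loop stops after at most k passes. Which b.f.s. is reached depends on the dependence vector chosen at each pass; the proof makes no claim about that.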

Notice that despite the constructive nature of the proof, theorem
4 does not directly provide a method of solving l.p.p.s because it
does not provide a means of obtaining the feasible or optimum vector
x with which the proof begins. Referring back to section 2.7 we
see that indirectly theorem 4 does provide us with a method for
solving any l.p.p. because it establishes that it is sufficient to find
the value of the objective function at all the b.f.s.s and choose the
optimum of these. Unfortunately this requires us to solve up to
$n!/(m!(n-m)!)$ $m \times m$ systems of equations, which is extremely
inefficient. In the simplex method in practice, we can expect to do
on average about as much work as there is in solving two $m \times m$
systems of equations. Even for n, m quite small, n = 20, m = 10 say,
$n!/(m!(n-m)!)$ is rather large, 184,756 in fact.
However, it is now clear that solving a l.p.p. only involves solving
a non-singular system of equations and the difficulty is that of deciding
which system to solve.
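
To make the cost of this exhaustive approach concrete, here is a minimal sketch for the special case m = 2, where each system can be solved by Cramer's rule. The function names `solve_2x2` and `best_bfs` are our own, and the data are those of the small problem solved graphically in section 3.1. The sketch assumes that a finite optimum exists; it makes no attempt to detect an unbounded problem.

```python
from fractions import Fraction
from itertools import combinations
from math import comb

def solve_2x2(M, rhs):
    """Cramer's rule for a 2 x 2 system; None if the matrix is singular."""
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    if det == 0:
        return None
    z1 = Fraction(rhs[0] * M[1][1] - M[0][1] * rhs[1], det)
    z2 = Fraction(M[0][0] * rhs[1] - rhs[0] * M[1][0], det)
    return [z1, z2]

def best_bfs(A, b, c):
    """Minimise c.x over all basic feasible solutions of Ax = b, x >= 0 (m = 2)."""
    n = len(c)
    best = None
    for j, k in combinations(range(n), 2):
        z = solve_2x2([[A[0][j], A[0][k]], [A[1][j], A[1][k]]], b)
        if z is None or min(z) < 0:
            continue                 # dependent columns, or basic but infeasible
        x = [Fraction(0)] * n
        x[j], x[k] = z
        value = sum(ci * xi for ci, xi in zip(c, x))
        if best is None or value < best[0]:
            best = (value, x)
    return best

A = [[-2, 1, 1, 0],
     [1, 2, 0, 1]]
b = [2, 8]
c = [2, -3, 0, 0]
value, x = best_bfs(A, b, c)

# Number of m x m systems to examine when n = 20 and m = 10.
systems_for_20_by_10 = comb(20, 10)
```

Here only six systems are examined, but the count grows very quickly with n and m, which is the inefficiency referred to above.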

We also observe that if the columns of $I_m$ are present among the
columns of A then we have one b.f.s. immediately: namely
$x_j = b_i$ if the j-th column of A is the i-th column of $I_m$, and $x_j = 0$
otherwise. In particular if we partition A into $(A_1, A_2)$ and $A_1 = I_m$,
then

$$x = \begin{pmatrix} \mathbf{x}_1 \\ \mathbf{x}_2 \end{pmatrix} = \begin{pmatrix} b \\ 0 \end{pmatrix} \quad \text{is a b.f.s.}$$

We shall use the notation "$A \supset I_m$" (A contains $I_m$) to denote
that the columns of $I_m$ are present among the columns of A. This
is just a convenient use of notation and does not mean, for us, that
each of the columns of $I_m$ can be written as a linear combination
of the columns of A, which is true for any $m \times n$ matrix with rank
m.
Exercises 2


1. Convert the following l.p.p.s to canonical form:
(i) maximise 3x, — x, (ii) maximise 2x, + 3x, (iii) minimise x, — x, 


subject to subject to subject to 

2x, + 3x, = 6 4x, + 2x,=4 X, — X% = 
x, + 7x, 24 Xj. 3x,fs35 X, +98, ix, 4 
Xy + %,=3 yi xy eo. x, =10. 

ig X= 0: 


2. Repeat section 2.3 but instead convert from canonical form to
standard form.

3. Suppose that the optimum solution of a l.p.p. in canonical form
is not unique. Show that the set of all optimum solutions is convex.

4. For a l.p.p. in canonical form, given that R is bounded and has
a finite number of extreme points and that any point of R can
be written as a convex combination of the extreme points of
R, prove that the optimum value is attained at an extreme point.


5. In the diet problem of section 1.2, give a physical interpretation
of the fact that the optimum solution will be basic. Give an
interpretation of the following possibilities:

(i) one row of A is zero,
(ii) one column of A is zero,
(iii) r(A) = (m − 1).

Does (iii) imply that the optimum solution will be degenerate?
Convert the diet problem to canonical form by introducing m
surplus variables. What does it mean if one of them is a basic
variable in the optimum solution?


6. Convert the manufacturer's problem, exercise 1.5, to canonical
form and give an interpretation of the possibility that a slack
variable is basic in the optimum solution. Note that each column
of the matrix of coefficients A describes one of the manufacturer's
activities. The slack variables and their corresponding columns
in canonical form are called disposal activities.


7. Solve the following l.p.p.s by finding all b.f.s.s:


(i) maximise 2x, + 3x, (ii) minimise (1,-1, 1, 1)x 
subject to subject to 
4x, + 2x, +x, =4 ee 2 5G 3 
hy ee 6446) 


Kus Xi 5, we O. x>0. 
8. Suppose that x and y are two distinct non-degenerate extreme
point optimum solutions (b.f.s.s) of a l.p.p. in canonical form.
Show that any non-trivial convex combination of x and y is an
optimum solution but not a b.f.s.


9. The unit matrix $I_3$ appears in the matrix A for the system of
equations below and hence there is an obvious basic solution.
By performing suitable row operations on the augmented matrix
(A, b) convert (i) the fourth (ii) the fifth column of A to a column
of $I_3$ and hence obtain basic solutions in which $x_4$, $x_5$ respectively
are basic variables.


r= @ B43 4 
ue 4.0 =<) P¥ee83 
o 0 | .4 -2 6 


10. Suppose that the $m \times n$ matrix A, with $m < n$, has rank m.
Prove that for any set of k independent columns a further
(m − k) columns can be found to complete a set of m independent
columns. (Just consider the remaining columns one at a time and
add them to the set we already have if they are independent.)

11. Prove the second part of the fundamental theorem of linear
programming (see section 2.9) by using the same argument that
was used to prove the first part and noting that

$$c^T x = \sum_{i=1}^{k} c_i (x_i - \theta\alpha_i)$$

because $(x_i - \theta\alpha_i)$, $i = 1, \ldots, k$, is a feasible solution for $\theta$ small enough and
positive or negative.


CHAPTER 3 


THE SIMPLEX METHOD


3.1
We first examine a simple example and solve it by a methodical
but essentially intuitive process.
Minimise $2x_1 - 3x_2$ subject to

$$-2x_1 + x_2 \le 2,$$
$$x_1 + 2x_2 \le 8, \qquad x_1, x_2 \ge 0.$$

This problem is easily solved graphically.

The minimum occurs at A, which is the point $(4/5, 18/5)$, and there

$$f(x) = 2x_1 - 3x_2 = -46/5. \qquad (1)$$

Converting to canonical form with two slack variables $x_3$ and $x_4$
gives
minimise $2x_1 - 3x_2 + 0x_3 + 0x_4$
subject to

$$-2x_1 + x_2 + x_3 = 2, \qquad (2)$$
$$x_1 + 2x_2 + x_4 = 8, \qquad x_1, x_2, x_3, x_4 \ge 0, \qquad (3)$$




or minimise $c^T x$ subject to $Ax = b$, $x \ge 0$,

where

$$c^T = (2, -3, 0, 0), \qquad (4)$$

$$b = \begin{pmatrix} 2 \\ 8 \end{pmatrix}, \qquad A = \begin{pmatrix} -2 & 1 & 1 & 0 \\ 1 & 2 & 0 & 1 \end{pmatrix}, \qquad x = \begin{pmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{pmatrix}. \qquad (5)$$

In canonical form we see that $A \supset I_2$, so there is one obvious b.f.s.:
$x_1 = x_2 = 0$, $x_3 = 2$, $x_4 = 8$. This corresponds to the point O in
the diagram, and the value of f there is $f_O = 0$.
The three other basic solutions that are feasible are:

$x_1 = x_3 = 0$, $x_2 = 2$, $x_4 = 4$, the vertex B, $f_B = -6$,
$x_2 = x_4 = 0$, $x_1 = 8$, $x_3 = 18$, the vertex D, $f_D = 16$,
$x_3 = x_4 = 0$, $x_1 = 4/5$, $x_2 = 18/5$, the vertex A, $f_A = -46/5$,

which confirms that the optimum value of f(x) is attained at A.



Notice that in (1) or (4) we effectively have f(x) expressed in terms
of the non-basic variables $x_1$ and $x_2$ of the basic solution provided
by the columns of $I_2$ in the matrix A. For any other basic feasible
solution one of $x_1$ and $x_2$ must be positive, since $x_1 = x_2 = 0$ implies
$x_3 = 2$, $x_4 = 8$, so we ask whether increasing $x_1$ or $x_2$ from its current
value of zero will decrease f. Since the coefficient of $x_2$ is negative
we will decrease f if we increase $x_2$. By how much can $x_2$ be increased?
The equalities in (2) and (3) must be maintained, so as $x_2$ is increased,
with the other non-basic variable $x_1$ remaining zero, the basic variables
$x_3$ in the first equation and $x_4$ in the second must be changed.

For example, in $-2x_1 + x_2 + x_3 = 2$, if we increase $x_2$ from zero
to $\theta$ say, we must decrease $x_3$ by $\theta$, so that the maximum value
$x_2$ can be given is 2, because anything higher would make $x_3$ negative.

This is possibly easier to follow if we rewrite the equations (2)
and (3) as follows,

$$x_3 = 2 + 2x_1 - x_2, \qquad (6)$$
$$x_4 = 8 - x_1 - 2x_2, \qquad (7)$$

and ask how large $x_2$ can be made, with $x_1 = 0$, before $x_3$ or $x_4$ becomes
negative. Equation (6) is the crucial one and gives $x_2 = 2$. However,
with $x_2 = 2$ and $x_1 = 0$ we have another b.f.s., with $x_4 = 4$ from
(7), and this is the vertex B. The value of f(x) at B is −6 and

$$f_B = -6 = 0 + 2 \times (-3),$$

which is the previous value $f_O$ plus the increase in $x_2$ multiplied by
the coefficient of $x_2$ in (1).

In order to use the same argument for the new b.f.s. we need
to convert (1), (2), and (3) to the same form, namely f(x) in terms
of the non-basic variables $x_1$, $x_3$, and equality constraints which express
the basic variables in terms of the non-basic variables $x_1$, $x_3$. Rearranging
(6) and substituting for $x_2$ in (7) gives

$$x_2 = 2 + 2x_1 - x_3, \quad \text{and} \qquad (8)$$
$$x_4 = 8 - x_1 - 2(2 + 2x_1 - x_3),$$
$$\text{i.e.} \quad x_4 = 4 - 5x_1 + 2x_3. \qquad (9)$$

Substituting in (1) for $x_2$ gives

$$f = 2x_1 - 3(2 + 2x_1 - x_3) = -6 - 4x_1 + 3x_3. \qquad (10)$$

Equations (8) and (9) show clearly what has been done, because the
new matrix of coefficients of the equality constraints still contains
$I_2$, but now the columns of $I_2$ correspond to $x_2$ and $x_4$. Also, the
cost coefficients in (10) corresponding to basic variables are zero.
Thus, putting $x_1 = x_3 = 0$ gives

$$x_2 = 2, \quad x_4 = 4, \quad f = f_B = -6.$$

From (10) we see that we can further decrease f by increasing $x_1$
(with $x_3$ remaining zero), and from (8) and (9) the maximum value
we can give $x_1$ is 4/5, which makes $x_4$ zero (and gives $x_2 = 18/5$).

The same rearrangement and substitution gives

$$x_1 = 4/5 + (2/5)x_3 - (1/5)x_4, \qquad (11)$$
$$x_2 = 18/5 - (1/5)x_3 - (2/5)x_4, \quad \text{and} \qquad (12)$$
$$f = -46/5 + (7/5)x_3 + (4/5)x_4, \qquad (13)$$

which corresponds to the vertex A.

Both (all) coefficients of non-basic variables in (13) are positive
and the current b.f.s. is obtained by giving them the value zero.
In any other b.f.s. (or any other feasible solution for that matter)
at least one of $x_3$ and $x_4$ will be positive and so f will be greater;
thus the current b.f.s. is optimum.


The procedure developed intuitively above is in fact the simplex
method. We now go on to describe it formally in general, and to
establish an alternative way of describing the operations that are
performed. This will, of necessity, appear more complicated, and
it is possible to make the simplex method appear very complicated
indeed, so it is important to bear in mind the essential simplicity
of the argument and the operations used.

Two comments may help: 


(i) The pairs of equations or equality constraints numbered (2) and
(3), (6) and (7), (8) and (9), and (11) and (12) are all equivalent.
They define the same region of 4-space and hence, together with
$x \ge 0$ and f(x), they define the same l.p.p.

The equations $EAx = Eb$, i.e. $A'x = b'$, where $A' = EA$ and
$b' = Eb$, are equivalent to the equations $Ax = b$ for any non-singular
matrix E, and there is a $2 \times 2$ matrix, $E_1$ say, which transforms
(2) and (3) to (8) and (9). We did not identify it at the time,
but it is in fact

$$E_1 = \begin{pmatrix} 1 & 0 \\ -2 & 1 \end{pmatrix}.$$

(ii) The different expressions for f(x) given by (1), (10) and (13) are
also equivalent, and although each has been associated with a
particular vertex of R each is valid throughout R and each has
the same value for any point of R. The fact that (10) and (13)
involve a constant as well as a linear combination of the variables
does not make them different from (1). In general, referring back
to sections 2.2 and 2.4, we could have assumed that f(x) had
the form $c^T x + \alpha$, for some constant $\alpha$, because this constant
essentially leaves the l.p.p. unchanged.


3.2 

We assume that the l.p.p. has been converted to canonical form.
It is then defined by A, b, and c and we shall refer to the elements
of this A, b and c as the original coefficients.

The simplex method requires the matrix A and the vector c to
have a certain form, namely $A \supset I_m$ and $c_j = 0$ if $x_j$ is a basic variable.
If this is the case, the method may begin at once. If not, some
preliminary manipulation is required (which we describe later in
sections 4.2-4.5) to produce an equivalent set of constraint equations
which we denote using $A'$ and $b'$ say, and a set $c'$ of equivalent
cost coefficients (e.c.c.s).

The equivalent cost coefficients $c'_1, c'_2, \ldots, c'_n$ are often called
relative cost coefficients or reduced cost coefficients, and sometimes
modified cost coefficients or simplex criteria. The name equivalent
is preferable since it describes exactly what they are, just as $A'x = b'$
is an equivalent set of equality constraints. Although any particular
set $c'$ of e.c.c.s appears associated with or relative to a particular
b.f.s. the corresponding expression $c'^T x - \alpha'$ for f(x) is valid at all
points of R (see section 3.6). Further comments on nomenclature
appear in section 5.7.
One stage of the simplex method produces a new b.f.s. and a new
set of coefficients, so we shall describe such a stage in terms of
$A'$, $b'$, $c'$ $(a'_{ij}, b'_i, c'_j,\ i = 1, 2, \ldots, m,\ j = 1, 2, \ldots, n)$ and
$A^*$, $b^*$, $c^*$ $(a^*_{ij}, b^*_i, c^*_j,\ i = 1, 2, \ldots, m,\ j = 1, 2, \ldots, n)$.
For convenience and w.l.o.g. we assume that the first m columns
of $A'$ are the columns of $I_m$, in order. The situation can then be
completely described by the following tableau of coefficients:

$$
\begin{array}{ccccc|ccccc|c}
1 & \cdots & 0 & \cdots & 0 & a'_{1,m+1} & \cdots & a'_{1t} & \cdots & a'_{1n} & b'_1 \\
\vdots & & \vdots & & \vdots & \vdots & & \vdots & & \vdots & \vdots \\
0 & \cdots & 1 & \cdots & 0 & a'_{s,m+1} & \cdots & a'_{st} & \cdots & a'_{sn} & b'_s \\
\vdots & & \vdots & & \vdots & \vdots & & \vdots & & \vdots & \vdots \\
0 & \cdots & 0 & \cdots & 1 & a'_{m,m+1} & \cdots & a'_{mt} & \cdots & a'_{mn} & b'_m \\
\hline
0 & \cdots & 0 & \cdots & 0 & c'_{m+1} & \cdots & c'_t & \cdots & c'_n & f + \alpha'
\end{array}
$$


We call such a tableau a simplex tableau. 

Here we have emphasised the t-th column and the s-th row for
reasons which will become clear shortly. The current b.f.s. is x, where

$$x_j = \begin{cases} b'_j, & j = 1, 2, \ldots, m, \\ 0, & j = m+1, m+2, \ldots, n. \end{cases}$$

The current basic variables are $x_1, x_2, \ldots, x_m$. The i-th row of the
tableau, $i = 1, 2, \ldots, m$, is a shorthand for

$$0x_1 + \cdots + 0x_{i-1} + x_i + 0x_{i+1} + \cdots + 0x_m + a'_{i,m+1}x_{m+1} + \cdots + a'_{it}x_t + \cdots + a'_{in}x_n = b'_i, \qquad (1)$$

and the last row of the tableau, which we will refer to as the 'c-row'
of the tableau, is a shorthand for

$$0x_1 + 0x_2 + \cdots + 0x_m + c'_{m+1}x_{m+1} + \cdots + c'_t x_t + \cdots + c'_n x_n = f + \alpha'. \qquad (2)$$

Since $x_{m+1} = x_{m+2} = \cdots = x_n = 0$ for the (current) corresponding b.f.s.
the value of this b.f.s. is $-\alpha'$ and in practice, when the symbolic
coefficients are replaced by numbers, the f can be omitted and just
the value of $\alpha'$ placed in the bottom right-hand corner.


One stage of the simplex method thus proceeds as follows:
If $c'_j \ge 0$, $j = m+1, m+2, \ldots, n$, the current b.f.s. is optimum,
since at present $x_{m+1}, x_{m+2}, \ldots, x_n$ are zero and in any different b.f.s.
at least one will be positive and thus f can only be increased. If
at least one e.c.c. is negative choose t such that

$$c'_t = \min_{j = m+1, m+2, \ldots, n} c'_j.$$

Now f can be decreased by increasing $x_t$ from its current value of
zero. Either (i) $a'_{it} \le 0$, $i = 1, 2, \ldots, m$,
or (ii) $a'_{it} > 0$ for some i, $1 \le i \le m$.


(i) If $x_t$ is increased to $\theta$ say ($\theta > 0$), then in the i-th constraint,
equality is preserved by adding $-a'_{it}\theta$ to $x_i$ for $i = 1, 2, \ldots, m$, and

$$x_i(\theta) = x_i - a'_{it}\theta \ge 0, \quad i = 1, 2, \ldots, m,$$

so $x(\theta)$ defined by

$$(x(\theta))_j = \begin{cases} x_j(\theta), & j = 1, 2, \ldots, m, \\ 0, & j = m+1, \ldots, n,\ j \ne t, \\ \theta, & j = t, \end{cases} \qquad (3)$$

is a feasible solution for any $\theta > 0$. However, the value of this
solution $f(x(\theta))$ is

$$-\alpha' + c'_t\theta, \qquad (4)$$

which can be made less than any chosen number K say by choosing
$\theta$ sufficiently large.

Therefore $c'_t < 0$, $a'_{it} \le 0$, $i = 1, 2, \ldots, m$, implies that the l.p.p.
has no minimum solution: the values of feasible solutions are
unbounded below!


(ii) If $x_t$ is increased to $\theta$ say ($\theta > 0$), then in the i-th constraint,
equality is preserved by adding $-a'_{it}\theta$ to $x_i$ for $i = 1, 2, \ldots, m$,
but now $-a'_{it}\theta < 0$ for some i so the maximum value that we
can choose for $\theta$ is

$$\min_{\substack{i = 1, 2, \ldots, m \\ a'_{it} > 0}} \frac{x_i}{a'_{it}} = \frac{x_s}{a'_{st}},$$

which is equal to

$$\min_{\substack{i = 1, 2, \ldots, m \\ a'_{it} > 0}} \frac{b'_i}{a'_{it}} = \frac{b'_s}{a'_{st}}. \qquad (5)$$

If the minimum is attained for several values of i, to make the procedure
definite we choose, at present, the minimum of these values for s.
Thus $x_s$ becomes non-basic with value zero, $x_t$ becomes basic with
value $\theta$, and other basic variables remain basic but their values change.
The value of f decreases by the positive amount $-\theta \times c'_t$ (or increases
by minus this amount), that is, by the positive quantity

$$-c'_t \frac{b'_s}{a'_{st}}. \qquad (6)$$

To produce the simplex tableau corresponding to the new b.f.s., we
just have to arrange that the columns of $I_m$ correspond to the new
basic variables, and that in the new e.c.c.s, $c^*_t$, corresponding to
$x_t$, is zero. Having chosen s and t, all this is achieved by the following
operations:


① Divide the s-th row of the tableau by $a'_{st}$,

$$\text{i.e.} \quad a^*_{sj} = a'_{sj}/a'_{st}, \quad j = 1, 2, \ldots, n, \qquad b^*_s = b'_s/a'_{st} \qquad (7)$$

(note that $a'_{st} > 0$, so $b^*_s \ge 0$).

② For $i = 1, 2, \ldots, s-1, s+1, \ldots, m$ add
$-a'_{it} \times$ (new s-th row) to the i-th row,

$$\text{i.e.} \quad a^*_{ij} = a'_{ij} - a'_{it} \times a^*_{sj} = a'_{ij} - a'_{it} \times \frac{a'_{sj}}{a'_{st}}, \quad j = 1, 2, \ldots, n,\ i = 1, 2, \ldots, m,\ i \ne s, \quad \text{and} \quad b^*_i = b'_i - a'_{it}\frac{b'_s}{a'_{st}}. \qquad (8)$$

③ Add $-c'_t \times$ (new s-th row) to the c-row,

$$\text{i.e.} \quad c^*_j = c'_j - c'_t\frac{a'_{sj}}{a'_{st}}, \quad j = 1, 2, \ldots, n, \quad \text{and} \quad \alpha^* = \alpha' - c'_t\frac{b'_s}{a'_{st}}. \qquad (9)$$

We shall refer to these as operations ①, ② and ③ of the simplex
method. Referring back to section 3.1 we see that:


① effectively expresses the new basic variable $x_t$ in terms of the
new set of non-basic variables, namely $x_s, x_{m+1}, \ldots, x_{t-1}, x_{t+1}, \ldots, x_n$,

② eliminates $x_t$ from the i-th equality constraint and introduces $x_s$
($i \ne s$) so that the i-th equation now expresses the basic variable
$x_i$ in terms of the new non-basic variables,

③ eliminates $x_t$ from the expression for f and produces a linear
combination of the new set of non-basic variables.

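The new tableau has the following form:

$$
\begin{array}{ccccc|ccccc|c}
1 & \cdots & a^*_{1s} & \cdots & 0 & a^*_{1,m+1} & \cdots & 0 & \cdots & a^*_{1n} & b^*_1 \\
\vdots & & \vdots & & \vdots & \vdots & & \vdots & & \vdots & \vdots \\
0 & \cdots & a^*_{ss} & \cdots & 0 & a^*_{s,m+1} & \cdots & 1 & \cdots & a^*_{sn} & b^*_s \\
\vdots & & \vdots & & \vdots & \vdots & & \vdots & & \vdots & \vdots \\
0 & \cdots & a^*_{ms} & \cdots & 1 & a^*_{m,m+1} & \cdots & 0 & \cdots & a^*_{mn} & b^*_m \\
\hline
0 & \cdots & c^*_s & \cdots & 0 & c^*_{m+1} & \cdots & 0 & \cdots & c^*_n & f + \alpha^*
\end{array}
$$

and we may repeat the procedure until either all e.c.c.s are non-negative
or we find a negative e.c.c. underneath a column of coefficients which
are all non-positive.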


3.3. To Summarise 


The criterion for optimality is $c'_j \ge 0$, $j = 1, 2, \ldots, n$; the criterion
for feasible solutions unbounded below is $c'_t < 0$ and $a'_{it} \le 0$,
$i = 1, 2, \ldots, m$, for any non-basic t.


It is worth looking back at the simplex procedure described in
the previous section to check that it is no more difficult to carry
out when the columns of $I_m$ are in random columns of the tableau.
The assumption that the first (or last) m columns are $I_m$ is a very
convenient one for descriptive and theoretical purposes, so we shall
make this assumption frequently, but in practice, of course, one has
to perform the appropriate operations whatever the positions of the
columns of $I_m$, i.e. whichever variables are basic.

The choice of min $c'_j$ to identify t is not essential. Any negative
e.c.c. can be used, but this choice is commonly used in practice.

We shall refer to the element $a'_{st}$ as the pivot.

The integers s and t identify the basic variable that is to become
non-basic and the non-basic variable that is to take its place. In terms
of the feasible region R one stage of the simplex method consists
of moving from one extreme point along an edge of R to an adjacent
extreme point where f has a lower value (provided that $b'_s > 0$).
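
The stage described in section 3.2 and summarised above can be sketched as a short Python function. This is a minimal illustration under stated assumptions, not a production routine: the function name `simplex` and its argument layout are our own, the starting tableau must already contain the columns of the unit matrix with zero cost coefficients on the basic columns and a non-negative right-hand side, and no precaution is taken against cycling. The data are those of the example in section 3.1.

```python
from fractions import Fraction

def simplex(rows, c_row):
    """Minimise f by repeated pivoting on a simplex tableau.

    rows  : m lists [a_i1, ..., a_in, b_i]
    c_row : [c_1, ..., c_n, alpha], standing for c_1 x_1 + ... + c_n x_n = f + alpha
    Returns ("optimal" or "unbounded", final rows, final c-row).
    """
    rows = [[Fraction(v) for v in r] for r in rows]
    c_row = [Fraction(v) for v in c_row]
    n = len(c_row) - 1
    while True:
        t = min(range(n), key=lambda j: c_row[j])     # most negative e.c.c.
        if c_row[t] >= 0:
            return "optimal", rows, c_row
        ratios = [(r[-1] / r[t], i) for i, r in enumerate(rows) if r[t] > 0]
        if not ratios:
            return "unbounded", rows, c_row           # column t is non-positive
        _, s = min(ratios)                            # ties: smallest row index
        pivot = rows[s][t]
        rows[s] = [v / pivot for v in rows[s]]        # operation 1
        for i in range(len(rows)):                    # operation 2
            if i != s:
                factor = rows[i][t]
                rows[i] = [v - factor * p for v, p in zip(rows[i], rows[s])]
        factor = c_row[t]                             # operation 3
        c_row = [v - factor * p for v, p in zip(c_row, rows[s])]

# Example of section 3.1 in canonical form.
start_rows = [[-2, 1, 1, 0, 2],
              [1, 2, 0, 1, 8]]
start_c = [2, -3, 0, 0, 0]
status, final_rows, final_c = simplex(start_rows, start_c)
f_min = -final_c[-1]
```

The final c-row reproduces the coefficients of equation (13) of section 3.1, and the bottom right-hand entry is the value of $\alpha'$, so the optimum value is its negative.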


3.4 Example 
Maximise $x_1 + 2x_2 + x_3$ subject to


2%; + Xy — Xo. 2, 
1 ed eS ee 
4X pA Xact pee Gn and * x45%5 BO: 
In canonical form we have minimise (−1,−2,−1,0,0,0)x subject to


Be mht hs veh dil, alt 2 
(2 = sg pig 0) x= (6). 
By XA. 4 thon Bh nO) 2 6 


Here $A \supset I_3$, and cost coefficients corresponding to basic variables
$x_4, x_5, x_6$ are zero, so we have the initial simplex tableau


X, Xp, Xy Xy Xs Xe 0 
eS a a 2 4. = 
nome > O U8 6 
ES 6 6 
=—| —-2 =} 0 0 OT; f+0 

“p 


Here we have written the variable names above the columns and
underlined the basic variables. This helps at first but should soon
be unnecessary, whereas a consistent notation to define the steps
of the method is strongly recommended.

Thus ↑ indicates the min $c'_j$, the θ-column lists the ratios $b'_i/a'_{it}$
for $a'_{it} > 0$ and ← indicates the minimum of these values. The circle
indicates the pivot.

In this tableau t = 2, s = 1, $x_4$ is to become non-basic and $x_2$ is
to become basic.


Now we carry out the computational operations ①, ② and ③
and continue.


ghana fog og ]!* 
G6 SF ol 13'S 
et 6 1) 4b 
EE Se ae ae 

“4 


a 31 
a 
i a 
0-2-2 
RT) AS 
ge 


In the third tableau all elements in the c-row are $\ge 0$ so we have
the optimum solution which, from the tableau, is $x_1 = x_4 = x_5 = 0$ (the
non-basic variables) and $x_2 = 4$, $x_3 = 2$, $x_6 = 0$ (the basic variables).

The optimum value is given by f + 10 = 0, i.e. $f_{\min} = -10$; remember
that the c-row is a shorthand for


4 
6x, + Ox, + Ox, + Yx, + 2x, + Ox, =f + 10.

We notice in this case that the optimum solution, $x = (0,4,2,0,0,0)^T$,
is degenerate, and also that in the second tableau there was a tie
in the θ-column. This tie indicates that the next b.f.s. will be degenerate,
but we observe that this does not cause the simplex procedure to
break down. We can also verify that if, in the second tableau, we
choose s = 3 we get the same solution, provided by the alternative
third tableau, but the tableau is different (ER). We can also verify
that a different choice of t in the first tableau also leads to the same
solution (ER).


3.5 
The crucial question now is whether or not we can be sure of 
obtaining the optimum solution in a finite number of simplex stages. 


Theorem 5 

If all b.f.s.s of a l.p.p. are non-degenerate the simplex method
terminates in a finite number of stages.

The proof of this important result is very simple. 

The feasible region R has a finite number of extreme points. If
the simplex method is not finite then at some stage it must produce
an extreme point (b.f.s.) already encountered. However, at any point,
and in particular at each extreme point, the objective function has
a unique value and this value decreases at each stage by the positive
amount

$$-c'_t \times b'_s/a'_{st}.$$

Thus the objective function values produced by the simplex method
are strictly monotone decreasing and so any particular extreme point
can only appear once as the current b.f.s.

If a l.p.p. has degenerate b.f.s.s then the simplex method could
possibly cycle round a sequence of identical degenerate extreme points
because then we could have $b'_s = 0$ and hence $\theta = 0$. However, this
need not happen as the examples in section 3.4 and exercise 4.1
show. We return to the problem of cycling in chapter 4 where we
also deal with the two remaining major problems, $A \not\supset I_m$ and $r(A) < m$,
neither of which requires the method we have already described to
be changed.
3.6

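We can verify that there is a simplex tableau for any b.f.s. and
in doing so we see the relationship between the original coefficients
A, b, c, and the coefficients of the tableau $A'$, $b'$, $c'$.

Let x be any b.f.s. and w.l.o.g. assume that $x_1, x_2, \ldots, x_m$ are
the basic variables. Partition x into $\mathbf{x}_1 = (x_1, x_2, \ldots, x_m)^T$ and
$\mathbf{x}_2 = (x_{m+1}, x_{m+2}, \ldots, x_n)^T$, c similarly into $\mathbf{c}_1$ and $\mathbf{c}_2$, and A into $(A_1, A_2)$, where $A_1$
is the first m columns of A, and $A_1$ is non-singular.

Then $A_1\mathbf{x}_1 + A_2\mathbf{x}_2 = b$, so

$$\mathbf{x}_1 = A_1^{-1}b - A_1^{-1}A_2\mathbf{x}_2 \ge 0,$$

and with $\mathbf{x}_2 = 0$, $x = \begin{pmatrix} A_1^{-1}b \\ 0 \end{pmatrix}$ is the b.f.s.

Referring back to the first tableau in section 3.2,

$$a'_{ij} = (A_1^{-1}A)_{ij} \quad \text{and} \quad b'_i = (A_1^{-1}b)_i.$$

Alternatively $a'_{\cdot j}$ is the j-th column of the matrix $A_1^{-1}A$.
The objective function

$$f(x) = c^T x = \mathbf{c}_1^T\mathbf{x}_1 + \mathbf{c}_2^T\mathbf{x}_2 = \mathbf{c}_1^T A_1^{-1}b + (-\mathbf{c}_1^T A_1^{-1}A_2 + \mathbf{c}_2^T)\mathbf{x}_2 = -\alpha' + c'_{m+1}x_{m+1} + \cdots + c'_n x_n.$$

So this b.f.s. is optimal if the vector of non-basic e.c.c.s is non-negative,
i.e. $-\mathbf{c}_1^T A_1^{-1}A_2 + \mathbf{c}_2^T \ge 0$.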
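
As a check on these relations, the tableau belonging to any chosen set of basic variables can be computed directly from the original coefficients. The sketch below is illustrative only: the function name `tableau_for_basis` is our own, and instead of forming $A_1^{-1}$ explicitly it row-reduces the augmented matrix (A, b) until the basic columns become the columns of the unit matrix, which has the same effect as pre-multiplying by $A_1^{-1}$. The data are those of section 3.1 with $x_1$ and $x_2$ basic.

```python
from fractions import Fraction

def tableau_for_basis(A, b, c, basis):
    """Return (rows, c_row) of the simplex tableau whose i-th row has the
    variable with 0-based index basis[i] as its basic variable."""
    rows = [[Fraction(v) for v in row] + [Fraction(bi)] for row, bi in zip(A, b)]
    for i, col in enumerate(basis):
        p = next((r for r in range(i, len(rows)) if rows[r][col] != 0), None)
        if p is None:
            raise ValueError("chosen columns are linearly dependent")
        rows[i], rows[p] = rows[p], rows[i]
        rows[i] = [v / rows[i][col] for v in rows[i]]
        for r in range(len(rows)):
            if r != i and rows[r][col] != 0:
                f = rows[r][col]
                rows[r] = [v - f * q for v, q in zip(rows[r], rows[i])]
    # c-row: start from c.x = f + 0 and eliminate the basic variables.
    c_row = [Fraction(v) for v in c] + [Fraction(0)]
    for i, col in enumerate(basis):
        f = c_row[col]
        c_row = [v - f * q for v, q in zip(c_row, rows[i])]
    return rows, c_row

A = [[-2, 1, 1, 0],
     [1, 2, 0, 1]]
b = [2, 8]
c = [2, -3, 0, 0]
rows_A, c_A = tableau_for_basis(A, b, c, [0, 1])   # x_1 and x_2 basic
```

The non-basic entries of `c_A` are the components of $-\mathbf{c}_1^T A_1^{-1}A_2 + \mathbf{c}_2^T$, and its last entry is $\alpha' = -\mathbf{c}_1^T A_1^{-1}b$; both are non-negative or irrelevant to the test exactly as described above, so this b.f.s. is optimal.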

This analysis does not give us an alternative method for solving
l.p.p.s because in general we do not know which m variables will
be basic in the optimum solution, and so we do not know which m
columns of A to choose as $A_1$. It does tell us how each successive
system of constraint equations and e.c.c.s is related to the original
system, and we may notice that if $A \supset I_m$ then at any stage the
appropriate $A_1^{-1}$ will actually be present in the simplex tableau,
occupying the columns of the tableau which were originally the columns
of $I_m$.

For many people the simplex method, in particular the tableau,
has a mysterious air about it. It is true that it is still not completely
clear why the simplex method works as well as it does in practice,
but the reasons why the method works at all and the purpose of
the operations at each stage present no difficulties if we remember
that each row of the tableau represents an equation (see (1) and
(2) of section 3.2) and if we bear in mind the interpretation of steps
①, ② and ③ of the simplex method given towards the ends of
sections 3.1 and 3.2.
3.7 

There is another useful way of representing one stage of the simplex 
method. Operations (1) and (2) consist of elementary row operations 
on the augmented matrix $(A', b')$. Performing an elementary row 
operation on any $m \times n$ matrix $A$ is equivalent to performing the 
operation on the unit matrix $I_m$, to obtain a matrix $E$ say, and then 
pre-multiplying $A$ by $E$. The matrix $E$ is called an elementary matrix. 

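For example $E_sA$, where 
$$E_s = \begin{pmatrix} 1 & & & & \\ & \ddots & & & \\ & & \lambda & & \\ & & & \ddots & \\ & & & & 1 \end{pmatrix} \quad\text{and}\quad (E_s)_{ss} = \lambda,$$
is the matrix obtained by multiplying the $s$-th row of $A$ by $\lambda$, and 
the matrix $E_iA$, where 
$$E_i = \begin{pmatrix} 1 & & & & \\ & \ddots & & & \\ & \lambda & \ddots & & \\ & & & \ddots & \\ & & & & 1 \end{pmatrix} \quad\text{and}\quad (E_i)_{is} = \lambda,$$
is the matrix obtained by adding $\lambda \times$ ($s$-th row of $A$) to the $i$-th row 
of $A$. 
Thus, referring to section 3.2, 
$$(A^*, b^*) = E_m \dots E_{s+1}E_{s-1} \dots E_2E_1E_s(A', b'),$$
where $\lambda$ in $E_s$ is $1/a'_{st}$ and 
$\lambda$ in $E_i$ is $-a'_{it}$, $i = 1, 2, \dots, s-1, s+1, \dots, m$. 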

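The special form of $E_i$ results in a simple form for the matrix 
product $E_m \dots E_{s+1}E_{s-1} \dots E_2E_1E_s$, namely 
$$E^k = \begin{pmatrix}
1 & & & -a'_{1t}/a'_{st} & & & \\
 & \ddots & & \vdots & & & \\
 & & 1 & -a'_{s-1,t}/a'_{st} & & & \\
 & & & 1/a'_{st} & & & \\
 & & & -a'_{s+1,t}/a'_{st} & 1 & & \\
 & & & \vdots & & \ddots & \\
 & & & -a'_{mt}/a'_{st} & & & 1
\end{pmatrix},$$
in which the column that differs from $I_m$ is the $s$-th column, and 
where $k$ denotes the number of the simplex stage (ER). 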
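
As a small check of this form, the sketch below builds $E^k$ from the pivot column and pre-multiplies a tableau by it. The helper names `stage_matrix` and `matmul` and the numerical tableau are hypothetical illustrations, not taken from the text.

```python
from fractions import Fraction

def stage_matrix(column, s):
    """E^k: the unit matrix with column s replaced by the multipliers
    -a'_it / a'_st (and 1 / a'_st in row s). `column` is the pivot column a'_t
    of the current tableau and s is the pivot row, both 0-based."""
    m = len(column)
    pivot = Fraction(column[s])
    E = [[Fraction(int(i == j)) for j in range(m)] for i in range(m)]
    for i in range(m):
        E[i][s] = 1 / pivot if i == s else -Fraction(column[i]) / pivot
    return E

def matmul(X, Y):
    """Plain matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

# Hypothetical augmented matrix (A', b'); pivot on row s = 1, column t = 1 (0-based).
tableau = [[2, 1, 1, 0, 4],
           [1, 3, 0, 1, 6]]
t, s = 1, 1
E = stage_matrix([row[t] for row in tableau], s)
new_tableau = matmul(E, tableau)
print(new_tableau)
```

After the multiplication, column `t` of `new_tableau` is a column of the unit matrix, which is exactly the effect of operations (1) and (2).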
In the example in section 3.4 


10 0 I 
Eft = ur 0), = te 
-10 1 0 - 


and the product $E^2E^1$ is the matrix $A_1^{-1}$ of section 3.4 for the final 
tableau. 
We note that the columns 1, 2 and 3 of A,', 


0 
Aj'= ( 0) =EFEs, 


l 
appear in columns 4, 5, and 6 of the final tableau because columns 
4, 5, and 6 of the original matrix $A$ were columns 1, 2, and 3 of $I_3$. 
For the e.c.c.s $c^*$ we note that if $e_s^T$ is the $m$-vector 
$(0, 0, \dots, 0, 1, 0, \dots, 0)$, where $(e_s)_s = 1$, then the $n$-vector $e_s^TA'$ is the $s$-th 
row of $A'$. 
Hence 
$$c^{*T} = c'^T + (-c'_t/a'_{st})\,e_s^TA',$$
and as $(1/a'_{st})\,e_s^TA'$ is just the $s$-th row of $A^*$, we have 
$$c^{*T} = c'^T + (-c'_t)\,e_s^TA^*.$$
Comparing the first and second tableaux of the example in section 
3.4, $t = 2$, $c'_t = -2$, $s = 1$, $a'_{st} = 1$, so $e_s^T = (1, 0, 0)$ and 
$e_s^TA^* = (2, 1, -1, 1, 0, 0)$ ($= e_s^TA'$ in this case). 
Thus the e.c.c.s in the second tableau, $(3, 0, -3, 2, 0, 0)$, are 
$(-1, -2, -1, 0, 0, 0) + 2(2, 1, -1, 1, 0, 0)$. 
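
The arithmetic of this update is easy to reproduce. A minimal sketch follows, using the numbers quoted above; the function name `update_eccs` is a hypothetical label for the formula $c^{*T} = c'^T + (-c'_t)\,e_s^TA^*$.

```python
def update_eccs(c_old, new_pivot_row, t):
    """Return c* = c' + (-c'_t) * (s-th row of A*), with t a 0-based column index."""
    return [cj - c_old[t] * aj for cj, aj in zip(c_old, new_pivot_row)]

c_first = [-1, -2, -1, 0, 0, 0]       # e.c.c.s of the first tableau
pivot_row = [2, 1, -1, 1, 0, 0]       # s-th row of A* (here s = 1, t = 2 in 1-based terms)
c_second = update_eccs(c_first, pivot_row, t=1)
print(c_second)                        # [3, 0, -3, 2, 0, 0]
```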

We note in passing that our observations about row operations 
and elementary matrices apply in exactly the same way to column 
operations if we replace pre-multiplication by post-multiplication. Thus 
to perform an elementary column operation on $A$, we perform the 
operation on $I_n$ to produce the elementary matrix $E$, and then form 
the product $AE$. 

To illustrate this point, we consider the remaining elementary matrix 
operation, that of interchanging two rows (or columns). 

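The matrix 
$$E = \begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix}$$
is $I_3$ with the first and second rows (or columns) interchanged, and for 
$$A = \begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{pmatrix},$$
$$EA = \begin{pmatrix} a_{21} & a_{22} & a_{23} \\ a_{11} & a_{12} & a_{13} \\ a_{31} & a_{32} & a_{33} \end{pmatrix} \quad\text{and}\quad AE = \begin{pmatrix} a_{12} & a_{11} & a_{13} \\ a_{22} & a_{21} & a_{23} \\ a_{32} & a_{31} & a_{33} \end{pmatrix}.$$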


The product of several elementary matrices of this type is called 
a permutation matrix $P$. Each element of $P$ is 0 or 1 and there is 
precisely a single 1 in each row and in each column of $P$. Pre-multiplying 
a matrix $A$ by $P$ re-orders or permutes the rows of $A$; post-multiplying 
$A$ by $P$ permutes the columns of $A$. 
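
The row and column interchange can be seen directly with numbers in place of the $a_{ij}$. This is a small illustrative sketch; the entries of `A` are made up so that each one shows its own row and column position.

```python
def matmul(X, Y):
    """Plain matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

# I_3 with the first and second rows (or columns) interchanged.
E = [[0, 1, 0],
     [1, 0, 0],
     [0, 0, 1]]
# Entry "ij" sits in row i, column j.
A = [[11, 12, 13],
     [21, 22, 23],
     [31, 32, 33]]

EA = matmul(E, A)   # rows 1 and 2 of A interchanged
AE = matmul(A, E)   # columns 1 and 2 of A interchanged
print(EA)
print(AE)
```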


The simplex method was devised by G. Dantzig in 1947. The 
development of the simplex method in chapters 1, 2, 3 is more or 
less standard, and most texts follow this treatment. The basically 
similar developments available in {9}, {10} and {12} are useful further 
reading. 
Exercises 3 
1. Solve the l.p.p.s in canonical form defined by the following data: 


oe ee Wee TR ee 5 
DACA. £6.23 4-3. 21 het 2 
1S SI a PS a 5 
$c^T = (0, 0, 0, -3, 8, -5)$. 
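(ii) 
$$A = \begin{pmatrix} 1 & 0 & 0 & 1 & 0 & 6 \\ 3 & 1 & -4 & 0 & 0 & 2 \\ 1 & 0 & 2 & 0 & 1 & 2 \end{pmatrix}, \quad b = \begin{pmatrix} 9 \\ 2 \\ 6 \end{pmatrix},$$

$c^T = (5, 0, -9, 0, 0, -5)$. 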

2. In the simplex method the decrease in f(x) at any stage is not 
necessarily the largest possible. Explain why, and explain what 
additional calculations are needed to achieve the largest possible 
decrease in f(x) at each simplex stage. (In practice the total number 
of stages needed is not reduced enough in general to make the 
extra calculations worthwhile.) 

3. Consider the l.p.p. 

minimise $c^Tx$ subject to $Ax = b$, $x_{k+1}, \dots, x_n \ge 0$, 
i.e. there are $k$ free variables $x_1, x_2, \dots, x_k$. Explain how a l.p.p. 
in canonical form of size $(m - k) \times (n - k)$ could be obtained instead 
of one of size $m \times (n + k)$ that would result from the conversion 
technique described in section 2.1 (iv). 

4. In a certain l.p.p. $x_1$ is a free variable and is replaced by 
$u_1 - v_1$, $u_1, v_1 \ge 0$, to obtain canonical form. Can $u_1$ and $v_1$ both 
be basic variables in the same b.f.s.? 

5. In the simplex method explain why the optimality test, $c'_j \ge 0$, 
$j = 1, 2, \dots, n$, is a sufficient but not a necessary test. (Hint: consider 
degeneracy.) 

6. Suppose that in a final (optimum) simplex tableau $c'_j = 0$ for 
some $j$ where $x_j$ is non-basic. Explain how a different optimum 
solution could usually be obtained, and explain why only usually. 
Hence state a simple sufficient criterion for deciding whether 
or not an optimum solution is unique, explain how all basic optimum 
solutions can be found and define, constructively, the set of all 
optimum solutions. 
7. What modifications to the simplex method as described are needed 
to solve directly problems of the form 
maximise $c^Tx$ subject to $Ax = b$, $x \ge 0$? 
Try them out on 1(i) above with $c^T = (0, 0, 0, 3, -8, 5)$. 

8. A l.p.p. with a finite optimum solution is being solved by the 
simplex method and in one tableau, which is not optimum, a 
single basic variable is zero. Prove that this tableau cannot reappear 
at a later stage. 

9. For the examples 1(i) and 1(ii) above, write down the matrices 
$E^k$. If the augmented matrix $(A', b')$ of the optimum tableau is 
$A_1^{-1}(A, b)$, identify $A_1^{-1}$ in the optimum tableau and verify that 
$(A', b') = A_1^{-1}(A, b)$ 
and that $A_1^{-1}$ is the product of the matrices $E^k$. 

10. At any stage of the simplex method the columns of $A$ corresponding 
to the current basic variables are linearly independent. If the 
$j$-th column of $A$, $a_j$, is expressed as a linear combination of 
them, show that the coefficients of the linear combination are 
the elements of $a'_j$, the $j$-th column of $A'$. 


CHAPTER 4 


THE SIMPLEX METHOD CONTINUED 


4.1 

The simplex method does not provide a guaranteed method for 
solving l.p.p.s in general until the three problems mentioned at the 
end of section 3.5 are dealt with. We will refer to them as the initial 
tableau, rank deficiency and cycling and discuss them in that order. 
Before doing so, some preliminary comments are worthwhile. We 
should expect that in general, after conversion to canonical form, 
$A$ will have rank $m$ and $I_m$ will not be present. If $r(A) < m$, then 
we cannot, by row operations on $A$, produce $I_m$ among the columns 
of $A$, since row operations do not change the rank. Again, if $r(A) < m$, 
then there are not $m$ independent columns of $A$ and so there 
are no basic solutions with $m$ basic variables. This does not mean 
that the feasible region is empty, just that basic solutions will only 
have $k = r(A)$ basic variables. Rather than modify the simplex procedure 
to take account of this, it is preferable to remove linearly dependent rows 
and then continue with a matrix $A$ that has full rank. However, if 
we have rank deficiency then we cannot have $I_m$ contained in $A$. 
So it is natural to deal with obtaining the initial tableau first. Note 
that if $r(A) = m$ as expected, we cannot simply reduce $m$ chosen 
(independent) columns to $I_m$ by row operations since we have no 
way of knowing that the resulting vector $b$ will be non-negative. It 
is possible that a b.f.s. is known even when $A \not\supset I_m$ but as this situation 
still requires almost as much work to produce the initial simplex 
tableau as the method we describe below, we ignore this possibility. 

Obtaining the initial tableau and dealing with rank deficiency requires 
consideration of a sequence of possible cases, and it is important 
to remember that we need a definite, algorithmic procedure. 


4.2 

We first mention a trivial, but very important, situation. 

Suppose, with the problem in canonical form, $b \ge 0$ and $A \supset I_m$, but 
the cost coefficients corresponding to the basic variables are non-zero. 
Then, although $(A, b)$ has the required form, $c$ does not and we do 
not yet have a simplex tableau as defined in section 3.2. 
To produce an appropriate set of e.c.c.s we add 
$-c_{j_i} \times$ ($i$-th row of $A$) to $c^T$, for $i = 1, 2, \dots, m$, where the $j_i$-th column 
of $A$ is the $i$-th column of $I_m$, $i = 1, 2, \dots, m$. 
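For example, in exercise 3.1(ii) where 
$$(A, b) = \left(\begin{array}{rrrrrr|r} 1 & 0 & 0 & 1 & 0 & 6 & 9 \\ 3 & 1 & -4 & 0 & 0 & 2 & 2 \\ 1 & 0 & 2 & 0 & 1 & 2 & 6 \end{array}\right),$$
suppose the objective function were 
$$f(x) = c^Tx = (1, -2, 1, 1, 1, -1)x, \qquad (1)$$
then $j_1 = 4$, $j_2 = 2$, $j_3 = 5$ and $\qquad (2)$ 
$$\begin{aligned}
c'^T = c^T &- 1(1, 0, 0, 1, 0, 6) \\
&+ 2(3, 1, -4, 0, 0, 2) \\
&- 1(1, 0, 2, 0, 1, 2) \\
= (5, 0, &-9, 0, 0, -5). \qquad (3)
\end{aligned}$$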
These are the objective function cost coefficients given in exercise 
3.1(ii), but notice that these row operations on $A$ must also be applied 
to the $b$ column, so instead of $f + 0$ in the bottom right-hand corner, 
we would start the simplex procedure with 
$$f + 0 + (-1) \times 9 + 2 \times 2 + (-1) \times 6 = f - 11. \qquad (4)$$
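This particular part of the conversion to simplex tableau form is 
easily done using the tableau and can be indicated as follows: 
$$\begin{array}{rrrrrr|r}
1 & 0 & 0 & 1 & 0 & 6 & 9 \\
3 & 1 & -4 & 0 & 0 & 2 & 2 \\
1 & 0 & 2 & 0 & 1 & 2 & 6 \\
\hline
1 & -2 & 1 & 1 & 1 & -1 & f + 0 \\
5 & 0 & -9 & 0 & 0 & -5 & f - 11
\end{array} \qquad (5)$$
If, for convenience, we assume that $A = (A_1, A_2)$, $A_1 = I_m$, 
$c^T = (\mathbf{c}_1^T, \mathbf{c}_2^T)$ and $\mathbf{c}_1 \ne 0$, then 
$$c'^T = c^T - \mathbf{c}_1^TA = (\mathbf{c}_1'^T, \mathbf{c}_2'^T), \quad\text{where } \mathbf{c}_1'^T = 0^T. \qquad (6)$$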
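
The conversion of a cost row to e.c.c.s can be sketched in a few lines. The function name `to_eccs` is a hypothetical label; the data are those of the worked example above, and the constant returned is the number added to $f$ in the bottom right-hand corner.

```python
def to_eccs(A, b, c, basic_cols):
    """Convert the cost row c to e.c.c.s when A already contains the unit matrix.

    basic_cols[i] is the 0-based column of A that equals the i-th column of I_m.
    Returns (c', constant), where the corner of the tableau reads f + constant."""
    c_new = list(c)
    constant = 0
    for i, j in enumerate(basic_cols):
        multiplier = c[j]                       # cost of the i-th basic variable
        c_new = [cn - multiplier * a for cn, a in zip(c_new, A[i])]
        constant -= multiplier * b[i]
    return c_new, constant

A = [[1, 0, 0, 1, 0, 6],
     [3, 1, -4, 0, 0, 2],
     [1, 0, 2, 0, 1, 2]]
b = [9, 2, 6]
c = [1, -2, 1, 1, 1, -1]
# j_1 = 4, j_2 = 2, j_3 = 5 in the 1-based numbering of the text.
c_prime, constant = to_eccs(A, b, c, basic_cols=[3, 1, 4])
print(c_prime, constant)                        # [5, 0, -9, 0, 0, -5] -11
```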


4.3 Obtaining the Initial b.f.s. and Simplex Tableau when $A \not\supset I_m$ 

We assume the l.p.p. is in canonical form and for convenience 
in this section we assume $r(A) = m$. We also assume that no columns 
of $I_m$ are present among the columns of $A$, leaving the intermediate 
case, when some but not all are present, to exercise 4.2. 
Consider the two l.p.p.s: 

I minimise $c^Tx$ subject to $Ax = b$, $x \ge 0_n$, and 

II minimise $(0^T, e^T)\begin{pmatrix} x \\ z \end{pmatrix}$ subject to $(A, I_m)\begin{pmatrix} x \\ z \end{pmatrix} = b$, $\begin{pmatrix} x \\ z \end{pmatrix} \ge 0_{n+m}$, where 
$e$ is the $m$-vector $(1, 1, \dots, 1)^T$. 

Problem II is in canonical form and, if written as 

minimise $\bar c^T\bar x$ subject to $\bar A\bar x = b$, $\bar x \ge 0$, 

we see that $\bar A \supset I_m$ and, using section 4.2, we easily obtain the 
initial simplex tableau and hence the optimum solution of problem II. 

The variables $z_1, \dots, z_m$ are called artificial variables, and problem 
II is concerned with minimising their sum. Notice that $\bar f = \bar c^T\bar x \ge 0$, 
and that problem II is feasible because $\begin{pmatrix} x \\ z \end{pmatrix} = \begin{pmatrix} 0 \\ b \end{pmatrix}$ is the b.f.s. of 
the initial tableau for problem II, so problem II has a finite optimum 
value $\bar f_{opt}$, and $\bar f_{opt} \ge 0$. (Remember $b \ge 0$, see section 2.2 and the 
end of section 2.9.) 


(i) If $\bar f_{opt} > 0$ then problem I has no feasible solutions. For suppose 
$x_0$ is a feasible solution for problem I, then $\begin{pmatrix} x \\ z \end{pmatrix} = \begin{pmatrix} x_0 \\ 0 \end{pmatrix}$ is a feasible 
solution for problem II with value zero. By the same argument, 
if problem II has any feasible solutions with value zero, then 
problem I has a feasible solution. 

(ii) If $\bar f_{opt} = 0$ then problem I is feasible, but we do not necessarily 
have a b.f.s. of problem I. There are two possibilities: either 
all the basic variables in the optimum solution of II are among 
$x_1, x_2, \dots, x_n$, or some basic variables are artificial variables with 
value zero. Remember that whatever set of e.c.c.s we produce 
when solving problem II we are still minimising $\sum_{i=1}^{m} z_i$, so if 
$\bar f_{opt} = 0$ then $z = 0$ in the optimum solution. 

Leaving the second possibility to (iii) below, we consider the 
first possibility, which we expect to be the case in general. Since 
all the $z_i$ are non-basic, we can assume w.l.o.g. that the basic 
variables are $x_1, x_2, \dots, x_m$ and that $I_m$ is in the first $m$ columns 
of the optimum tableau. This tableau has the following form: 
$$\left(\begin{array}{cc|c|c} I_m & A'_2 & A'_1 & b' \end{array}\right) \qquad (1)$$
where the columns of $I_m$ and $A'_2$ belong to $x_1, \dots, x_n$, the columns of 
$A'_1$ belong to $z_1, \dots, z_m$, and the c-row has coefficients 
$0, \dots, 0, \bar c'_{m+1}, \dots, \bar c'_n$ in the first $n$ columns, 
and to obtain an initial simplex tableau for the original problem, 
problem I, we simply remove the columns corresponding to 
$z_1, \dots, z_m$, replace the coefficients $0, \dots, 0, \bar c'_{m+1}, \dots, \bar c'_n$ in the c-row 
by the original cost coefficients $c^T$, and convert this new c-row 
to the form $0, \dots, 0, c'_{m+1}, \dots, c'_n$ as described in section 4.2. 

It is convenient to call problem II the artificial problem, and 
solving it, part I. Solving the original problem, starting with the 
initial simplex tableau with coefficients $A'$, $b'$, $c'$, is called part 
II. Together they constitute the two-part simplex method for 
solving any l.p.p. (in canonical form). (The name two-stage simplex 
method is sometimes used, but we shall use 'stage' to refer to 
one of the steps of the simplex method as described in section 
3.2.) A simple example to illustrate the two-part method is given 
in section 4.4 below. 

Notice that if the matrix $A$ of the original problem (problem I) 
is partitioned 

$A = (A_1, A_2)$, where $A_1$ is $m \times m$, 
then $A'_1$ in (1) must be $A_1^{-1}$, $b' = A_1^{-1}b$ and $A'_2 = A_1^{-1}A_2$. 

It is worth recalling the remarks near the ends of sections 
2.9 and 4.1. Just as, to solve a l.p.p. we need to solve a particular 
$m \times m$ system of equations and the simplex method is in effect 
an efficient way of finding that system, so, to find an initial 
simplex tableau we need to reduce $m$ columns of $A$ to $I_m$ by 
row operations and the artificial problem provides an efficient 
way of choosing an appropriate set of $m$ columns. 

(iii) If $\bar f_{opt} = 0$, but the basic variables in $\bar x_{opt}$ include some artificial 
variables, then, supposing w.l.o.g. that the basic variables are 
$x_1, x_2, \dots, x_k, z_{k+1}, z_{k+2}, \dots, z_m$, the final tableau for problem II 
has the form below, where the c-row, which is to be discarded, 
has been omitted: 


$$\left(\begin{array}{cc|cc|c} I_k & B_1 & B_2 & 0 & \mathbf{b}'_1 \\ 0 & B_3 & B_4 & I_{m-k} & 0 \end{array}\right) \qquad (2)$$
The matrix $B_1$ is $k \times (n-k)$, $B_2$ is $k \times k$, $B_3$ is $(m-k) \times (n-k)$ 
and $B_4$ is $(m-k) \times k$. 

To obtain the initial tableau for problem I we just perform 
appropriate row operations to produce the missing columns of 
$I_m$ in the $\begin{pmatrix} B_1 \\ B_3 \end{pmatrix}$ part of the tableau. These operations are just 
operations (1) and (2) of the simplex method, using any non-zero 
element of $B_3$ as pivot. For example, suppose $(B_3)_{11} \ne 0$ ($(B_3)_{11} 
= a'_{k+1,k+1}$). Then operations (1) and (2) will produce $e_{k+1}$ in 
the $(k+1)$-th column (and remove $e_{k+1}$ from the $(k+1)$-th artificial 
column), $x_{k+1}$ will become basic and $z_{k+1}$ non-basic. This effectively 
reduces the size of $B_3$ by one row and one column, and 
we continue until $B_3$ disappears completely, at which time the 
$m$ columns of $I_m$ are present in the first $n$ columns of the tableau 
and we have the situation discussed in (ii) above. A non-zero 
element can always be found in $B_3$ (ER). Notice that the b-column 
is unchanged because $b'_{k+1} = b'_{k+2} = \dots = b'_m = 0$. The $m$ artificial 
columns may be removed. There are sometimes reasons for leaving 
them, in which case they provide a record, as in (ii) above, of 
the operations which have transformed the system from $(A, b)$ 
to $(A', b')$, where $A' \supset I_m$. 
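
The first part of the two-part method can be sketched compactly. What follows is an illustration under stated assumptions rather than the text's own procedure: the names `pivot`, `simplex` and `part_one` are hypothetical, the column $t$ and row $s$ are chosen by a smallest-index rule (a simplifying assumption, not necessarily the rule of section 3.2), exact fractions stand in for hand arithmetic, and the case of artificial variables left basic at zero level, case (iii), is not treated.

```python
from fractions import Fraction

def pivot(T, s, t):
    """Scale row s so the pivot is 1, then clear column t from every other row
    (the cost row is the last row of T and is treated like the others)."""
    p = T[s][t]
    T[s] = [v / p for v in T[s]]
    for i in range(len(T)):
        if i != s and T[i][t] != 0:
            factor = T[i][t]
            T[i] = [v - factor * w for v, w in zip(T[i], T[s])]

def simplex(T, basis):
    """Minimise, starting from a valid simplex tableau T whose last row holds the
    e.c.c.s and whose last column holds b. Smallest-index choices are assumed."""
    m = len(T) - 1
    while True:
        t = next((j for j, cj in enumerate(T[m][:-1]) if cj < 0), None)
        if t is None:
            return                              # all e.c.c.s non-negative: optimum
        rows = [i for i in range(m) if T[i][t] > 0]
        if not rows:
            raise ValueError("objective unbounded below")
        s = min(rows, key=lambda i: (T[i][-1] / T[i][t], basis[i]))
        pivot(T, s, t)
        basis[s] = t

def part_one(A, b):
    """Set up and solve the artificial problem: minimise the sum of the z_i
    subject to Ax + z = b, x >= 0, z >= 0 (b is assumed non-negative)."""
    m, n = len(A), len(A[0])
    T = [[Fraction(v) for v in A[i]] + [Fraction(int(i == k)) for k in range(m)]
         + [Fraction(b[i])] for i in range(m)]
    cost = [Fraction(0)] * n + [Fraction(1)] * m + [Fraction(0)]
    for row in T:                               # convert to e.c.c.s as in section 4.2
        cost = [c - r for c, r in zip(cost, row)]
    T.append(cost)
    basis = list(range(n, n + m))               # the artificial variables are basic
    simplex(T, basis)
    return -T[m][-1], T, basis                  # optimum value, tableau, basic columns

value_feasible, T1, basis1 = part_one([[1, 1], [1, -1]], [2, 0])
value_infeasible, T2, basis2 = part_one([[1, 1], [1, 1]], [1, 2])
print(value_feasible, basis1, value_infeasible)
```

In the first hypothetical system the optimum value is zero and both basic variables are original variables, so the first $n$ columns of `T1` give an initial tableau for the original problem. In the second system the optimum value is positive, so the original constraints have no non-negative solution.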


4.4 Example 
Solve the l.p.p. 
minimise $x_1 + 5x_2 + 2x_3 + 2x_4 + 7x_5$ subject to 


l I Past ,O\e,_ f2 
(1 - i Waa) 1) x= (3). x20. 
No columns of $I_2$ are present among the columns of $A$ so we use 
the two-part method, with artificial variables $z_1$ and $z_2$. The artificial 


problem is minimise z,+z, = f(x) = (0,0,0,0,0,1,1) . subject to 


(' bt at “C) (*) (*) 
= ; => 0. 
ae 0 =! 0 1 Zz 5 Zz 
The initial tableau (for the artificial problem) and the simplex calcula- 
tions now follow: 


48 LINEAR PROGRAMMING AND APPLICATIONS §4.4 


l (0) 1 a 0 l 0 2 2<€ 
m 2 ia I ae 5 “ip (1) 
oe 0 0 0 0 l l J » 
at ag l 1 0 0.1 fi-7 
i 
1 l 1 =1 0 I 0 3 
se Dake @) <li se I I + € @Q) 
Pina (O41 + &) sd aBort ,, eoforn lndiadupis Mt 
dh 
Soars ata Agr toa th te. Radke os 
SONERE  US MBAs pels Be pes Ona Be (3) 
0 0 0 0 0 I 1 f 


At this stage $\bar c' \ge 0$, so we have $\bar f_{opt}$; $\bar f_{opt} = 0$ so the feasible region 
$R$ for the original problem is not empty. 
The initial b.f.s. for the original problem is provided by the tableau 
(3) and is 
x, = 4, x, =F, X= xy = x, = 0 
and the equivalent system of equations $A'x = b'$, with $I_2 \subset A'$, is 


i l J 0 ed 
A’=| - ; a) v= ( ) - (4) 
( oe 


To obtain the initial simplex tableau for the original problem, i.e. 
the initial tableau of part II, we remove the artificial columns, replace 
the c-row by (1,5,2,2,7) and convert the cost coefficients to e.c.c.s 
with c, = c, = 0. 


Ni= Niw 


ODD eee 4 . oe 
241 OP) agit G4 ; (5) 
l 5 2. Am y 0 
G ap6 Qin eorkee ae d 
ay 
] 2 l 0 -!il 5 
0 1 0 1 a | 3 - (6) 


0 bud (nsilionQorhthaladal 
Here $c' \ge 0$, so we have the final or optimum tableau for the original 
problem. The optimum solution is $x_{opt} = (5, 0, 0, 3, 0)^T$ and the optimum 
value is $f_{opt} = 11$. 
4.5 Rank Deficiency, or Redundancy and Inconsistency. 

If $r(A) < m$ then $A \not\supset I_m$ so we must necessarily use the two-part 
simplex method and we assume, for convenience, that the artificial 
problem is problem II in section 4.3. 


(i) If $\bar f_{opt} = 0$, then the original problem is feasible, but as $r(A) \ne m$ 
we must have an optimum tableau with the form illustrated in 
section 4.3 (iii) which, in this instance, we write as follows, where 
$\mathbf{b}'_1 \ge 0$ and $\mathbf{b}'_2 = 0$: 
$$\left(\begin{array}{cc|cc|c} I_k & B_1 & B_2 & 0 & \mathbf{b}'_1 \\ 0 & B_3 & B_4 & I_{m-k} & \mathbf{b}'_2 \end{array}\right) \qquad (1)$$

In practice we will not know that $r(A) < m$. So we proceed 
as in section 4.3 (iii), pivoting on non-zero elements of $B_3$. If 
$r(A) < m$, at some stage the matrix $B_3$ will be zero, in which 
case the last $(m - k)$ rows can be discarded and we have the 
situation discussed in 4.3(ii), with $m = k$. We will thus have removed 
the $(m - k)$ redundant equations and obtained an initial simplex 
tableau. We will have performed not much more work than a 
conventional reduction of $A$ to row-echelon form to determine 
the redundant equations, which would only take us to the beginning 
of the first part. 


(ii) If $\bar f_{opt} > 0$, then there are no feasible solutions to the original 
problem, so there is no l.p.p. to solve. However, for a real problem 
this would probably be rather disturbing and we would wish to 
determine whether the given equality constraints were consistent 
(there are solutions but none non-negative) or inconsistent (there 
are no solutions at all), because this would help to determine 
the error in the model formulation. 

The optimum tableau for problem II will be as (1) above, but 
now $\mathbf{b}'_1 \ge 0$ and $\mathbf{b}'_2 \ge 0$ and $\mathbf{b}'_2 \ne 0$ because $\bar f_{opt} > 0$ ($z_i \ge 0$ and 
$z \ne 0$). Again we proceed as in section 4.3 (iii), pivoting on 
any non-zero element of $B_3$ and reducing the size of $B_3$ until 
$B_3$ is zero. If at that stage $\mathbf{b}'_2 = 0$ then the original constraints 
are consistent, but some are redundant. If $\mathbf{b}'_2 \ne 0$ then the original 
constraints are inconsistent; there are no vectors $x$ such that $Ax = b$ 
(ER). 


4.6 

A comment is appropriate at this stage before discussing cycling. 
The analysis in section 4.5, along with everything that has preceded 
it, tacitly assumes that the arithmetic in operations (1), (2), (3) of 
the simplex method will be performed exactly. This will be the case 
in academic exercises and in certain special problems, but in general, 
when l.p.p.s are solved on a computer, the arithmetic is not performed 
exactly and so, for example, $B_3$ and $\mathbf{b}'_2$ in (1) of section 4.5 will 
almost never be exactly zero unless they are empty throughout. A 
proper investigation of the consequences of this observation is a serious 
undertaking in the mathematical field of numerical analysis, but it 
leads to technical modifications in the details of the arithmetic 
operations rather than in the simplex method itself. (These modifications 
are described and discussed briefly in Appendix 3.) This observation 
about arithmetic operations does mean, however, that section 
4.5 is somewhat unrealistic for general l.p.p.s, unless all coefficients 
with magnitude less than some small number $\varepsilon$ are regarded as zero. 
If the coefficients of $A$ and $b$ were given to four decimal places 
for example, then we might choose $\varepsilon = 0.5 \times 10^{-4}$. 
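
The point about inexact arithmetic is easy to see in ordinary floating-point arithmetic. The sketch below is only an illustration of the thresholding idea; the helper name `clean` is hypothetical, and the default tolerance is the value $0.5 \times 10^{-4}$ suggested above.

```python
def clean(values, eps=0.5e-4):
    """Regard every coefficient of magnitude less than eps as zero."""
    return [0.0 if abs(v) < eps else v for v in values]

# In binary floating point this quantity is tiny but not exactly zero.
residual = 0.1 + 0.2 - 0.3
print(residual == 0.0)                 # False
print(clean([residual, 1.25, -3e-5]))  # [0.0, 1.25, 0.0]
```

A test such as "$B_3$ is zero" would then be applied to the cleaned coefficients rather than to the raw ones.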


4.7 Cycling 

This section can also be regarded as unrealistic. In practice only 
specially constructed l.p.p.s will cycle and the modification which 
we derive below, to the method of section 3.2(ii) for choosing $s$, 
is not incorporated in computer programs for solving l.p.p.s. Nevertheless 
it is a worthwhile excursion because the modification has 
a certain charm and it removes the qualifications about degenerate 
b.f.s.s in theorem 5 (section 3.5). 

Cycling cannot occur if all b.f.s.s which the simplex method produces 
are non-degenerate, and if the current b.f.s. is non-degenerate 
the next one produced is degenerate if and only if there is a tie 
in the $\theta$-column of the tableau. So we perturb the original problem 
in such a way that a tie cannot occur. For convenience we assume 
that $A \supset I_m$ so that we do not need the artificial first part. 
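Now replace $b$ by $b(\varepsilon)$, where 
$$b(\varepsilon) = b + \sum_{j=1}^{n} \varepsilon^j a_j,$$
and $\varepsilon$ is some small positive number. 
In other words 
$$b_i(\varepsilon) = b_i + \varepsilon a_{i1} + \varepsilon^2 a_{i2} + \varepsilon^3 a_{i3} + \dots + \varepsilon^n a_{in}, \quad i = 1, 2, \dots, m.$$
For $\varepsilon$ sufficiently small a tie cannot occur in the $\theta$-column, whose 
entries are now 
$$\frac{b_i(\varepsilon)}{a_{it}}, \quad a_{it} > 0, \quad i = 1, 2, \dots, m.$$
For, suppose $\dfrac{b_1}{a_{1t}} = \dfrac{b_2}{a_{2t}} = \min\limits_{a_{it} > 0} \dfrac{b_i}{a_{it}}$. Then if $\dfrac{a_{11}}{a_{1t}} > \dfrac{a_{21}}{a_{2t}}$ then 
$\dfrac{b_1(\varepsilon)}{a_{1t}} > \dfrac{b_2(\varepsilon)}{a_{2t}}$ for $\varepsilon$ sufficiently small, and conversely if $\dfrac{a_{11}}{a_{1t}} < \dfrac{a_{21}}{a_{2t}}$. 
If $\dfrac{a_{11}}{a_{1t}} = \dfrac{a_{21}}{a_{2t}}$, then we consider the $\varepsilon^2$ terms, $\dfrac{a_{12}}{a_{1t}}$ and $\dfrac{a_{22}}{a_{2t}}$. If these 
do not resolve the tie we consider the $\varepsilon^3$ terms and so on. For some 
$j$, $1 \le j \le n$, we must have $\dfrac{a_{1j}}{a_{1t}} \ne \dfrac{a_{2j}}{a_{2t}}$ (ER). The argument clearly 
applies to any tableau, not just the first, and to a tie between any 
number of rows. 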

We may now observe that nowhere do we actually need the value 
of $\varepsilon$, so that even if we wanted to incorporate this perturbation 
modification to the simplex method, which we do not, we would 
not actually have to perturb the problem at all. 
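
Because the value of $\varepsilon$ is never needed, the tie-breaking rule amounts to comparing, for each candidate row, the ratio $b_i/a_{it}$ followed by the ratios $a_{i1}/a_{it}, a_{i2}/a_{it}, \dots$ in order. A minimal sketch of that comparison follows; the function name `choose_row` and the small tableau are hypothetical, and the cost row is assumed to be excluded from `T`.

```python
from fractions import Fraction

def choose_row(T, t):
    """Choose the pivot row s for column t (0-based) without a numerical epsilon.

    Each row of T is (a_i1, ..., a_in, b_i) from the current tableau. Rows with a
    positive entry in column t are ranked by b_i/a_it, then by the coefficients of
    epsilon, epsilon^2, ... in b_i(epsilon)/a_it; the smallest is chosen."""
    best = None
    for i, row in enumerate(T):
        a_it = Fraction(row[t])
        if a_it <= 0:
            continue
        key = [Fraction(row[-1]) / a_it] + [Fraction(v) / a_it for v in row[:-1]]
        if best is None or key < best[0]:   # Python compares lists term by term
            best = (key, i)
    return None if best is None else best[1]

# Hypothetical tableau rows with a tie in the theta-column for t = 2: 4/2 = 8/4.
T = [[1, 0, 2, 1, 0, 4],
     [0, 1, 4, 0, 1, 8]]
s = choose_row(T, t=2)
print(s)
```

Here the $\varepsilon$ terms are $1/2$ for the first row and $0$ for the second, so the second row gives the smaller perturbed ratio and is chosen.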

To demonstrate, we apply the modification to the example discussed 
in section 3.4. 

The second tableau was 


al i = Db 0 ee b(e) i 
es -l £ Che t4+2e¢e~+ C+ 
+2 4.4 £ OF 8 1218 + 4 + de” 4 e* + €* 
20 2-10 1] 4 [2/442 + 2e°—¢* ft 
30 43,2 0 OT /+4 
4 


where we have added the explicit description of b(e), although this 
is unnecessary as the coefficients involved appear in the main part 
of the tableau. 
In this case $t = 3$ and 


b, by, re 

—ae n — 

Gy, 433 t=1,2.3 14, 
aj,,>0 


The € terms give * = 3, which is still a tie. 
The e€” terms give 0 = 0, which is still a tie. 
The e° terms give * = 3, which is still a tie. 


b,(e) 6b, (e) b(e) b,(e) 
4 ee ee Z > i ; == 
The €° terms give ; > —3, so a>, Ss a5 °o ig i a iss 


for $\varepsilon$ sufficiently small. 
Thus in this case we choose $s = 3$. 

It should be emphasised that degenerate b.f.s.s do not necessarily 
lead to cycling. 

An alternative approach for resolving cycling may be found in {10}. 
Exercises 4 


1. Solve the /.p.p. in canonical form defined by the following data: 


1 to QroQrloodans 4\ 
a=(0 isd al i). b= (3). "= .ns.nn, 
ht -aille Se maes 6 


Comment on the final tableau and optimum solution and obtain 
a different final tableau. 

2. Suppose that the columns of $A$ in a l.p.p. in canonical form include 
some, but not all, of the columns of $I_m$. What is the artificial 
problem which is solved in the first part of the two-part simplex 
method? 

(Hint: for convenience, assume that the last $k$ columns of $A$ are 
the first $k$ columns of $I_m$.) 

3. Solve the l.p.p.s below by the two-part simplex method. For the 
example in which $f(x) = c^Tx$ is unbounded (in the appropriate 
sense) find a feasible solution with value 1000. 

(i) minimise x, — 2x, + 3x, subject to 


44 })x=(?).x=0. 


(ii) maximise 3x, — x, — 3x, + x, subject to 


ro 2 =f 4 0 
2 + Siew i9i,xz0. 
1 -) 2 -1 6 
(iii) maximise —x, + 2x, — x, subject to 
(7 9)x=(§). x20. 


4. Using artificial variables find a non-negative solution of the system 
of equations 


x —X, t+ x, =3 
Gs, — 32, — x, =7 
3x, — 2%, ~ a, 24. 


5. For the following systems of constraints, use artificial variables 
to find an initial simplex tableau (without the c-row) or to show 
that an initial simplex tableau cannot be found: 


(i) | -1 1 Prix tsh, += 6: 
1 -| I 7 

v(3 

(il) = 


AN= We = 
| 
rNo-— 
R4 
ek) 
Sensi 
~*~ 
ll 
PER 
Onn 
eee” 
* 
IV 
= 
6. When the l.p.p. 
minimise $c^Tx$ subject to $Ax \ge b$, $x \ge 0$ (where $b \ge 0$) 
is converted to canonical form, the columns of the matrix of 
coefficients $(A, -I_m)$ include no columns of $I_m$ in general. Devise 
a simple set of $(m - 1)$ row operations as a result of which only 
one artificial variable (instead of $m$) needs to be introduced in 
the first stage of the two-part simplex method. 

7. In the $\varepsilon$ perturbation method for avoiding cycling, if the initial 
vector $b$ has some zero elements, it is possible that the b.f.s. 
defined by $b(\varepsilon)$ is not strictly feasible; for example, suppose 
$b_i = 0$ and $a_{i1} < 0$. This conflicts with the requirements for the 
simplex method, but is easily avoided. Explain how. 



CHAPTER 5 


DUALITY: THE DUALITY THEOREM AND 
CONSEQUENCES 


5.1 

We introduce this important, and quite unexpected, development 
by recalling the dietician whom we met in section 1.2. We can now 
imagine him happily working out an optimum diet (using the simplex 
method). But he is interrupted by the arrival of another character 
in this gastronomic drama, the nutrient tablets salesman. The salesman 
says that he can supply, at a price, all the nutrients in tablet form 
and he suggests that the dietician take advantage of this. The dietician 
will not be interested in this offer if the cost of so buying the component 
nutrients in one unit of the $j$-th food is greater than $c_j$. So the salesman's 
prices $y_i$, $i = 1, 2, \dots, m$, of units of $i$-th nutrient must satisfy 

$$y_1a_{1j} + y_2a_{2j} + \dots + y_ma_{mj} \le c_j, \quad j = 1, 2, \dots, n. \qquad (1)$$

In other words, the costs of artificial foods as provided by the 
salesman must not exceed the costs of the natural foods available 
to the dietician. 

However, the salesman wishes his total income from any deal with 
the dietician to be as large as possible, so that subject to (1) 
and $y_i \ge 0$, $i = 1, 2, \dots, m$, $\qquad (2)$ 
he wishes to maximise $y_1b_1 + y_2b_2 + \dots + y_mb_m$. $\qquad (3)$ 

Thus the salesman's problem is another l.p.p., namely 
maximise $y^Tb$ subject to $y^TA \le c^T$, $y \ge 0$. $\qquad (4)$ 

This l.p.p. (4) is said to be the dual problem of the primal problem 
of section 1.2. It would seem natural to rewrite (4) transposed, 
i.e. maximise $b^Ty$ subject to $A^Ty \le c$, $y \ge 0$, $\qquad (5)$ 
but the form given in (4) is generally preferable. 

At present the connection between the l.p.p.s (1) of section 1.2 
and (4) is the purely formal one that they are defined in terms of 
the same data, but we will establish in this chapter profound general 
connections between the dietician's and the salesman's problems. 

In fact for any l.p.p. there is a corresponding dual problem and 
we now formally define them for l.p.p.s in canonical and standard 
forms. 


5.2 

(i) Canonical Form 

Primal: minimise $c^Tx$ subject to $Ax = b$, $x \ge 0$, $\qquad (1)$ 
Dual: maximise $y^Tb$ subject to $y^TA \le c^T$. $\qquad (2)$ 

(ii) Standard Form 

Primal: minimise $c^Tx$ subject to $Ax \ge b$, $x \ge 0$, $\qquad (3)$ 
Dual: maximise $y^Tb$ subject to $y^TA \le c^T$, $y \ge 0$. $\qquad (4)$ 


Notice that in both cases one l.p.p. is a minimisation problem and 
the other a maximisation problem, but whereas in (ii) both sets of 
constraints are inequalities in non-negative variables, in (i) the primal 
constraints are equations in non-negative variables and the dual 
constraints are inequalities in free variables. For this reason (ii) is 
called the symmetric form of the primal and dual l.p.p.s and (i) the 
unsymmetric form. 

Since any /.p.p. can be converted to, say, canonical (primal) form 
two questions arise immediately: are (i) and (ii) equivalent, and what 
is the dual of the dual? These questions are answered by the following 
theorems. 


Theorem 6 

The canonical (unsymmetric) and standard (symmetric) forms of 
primal and dual l.p.p.s are equivalent. 

We convert the standard primal to canonical primal form. 

Introducing surplus variables $z_1, \dots, z_m$ gives 
minimise $(c^T, 0^T)\begin{pmatrix} x \\ z \end{pmatrix}$ subject to $(A, -I_m)\begin{pmatrix} x \\ z \end{pmatrix} = b$, $\begin{pmatrix} x \\ z \end{pmatrix} \ge 0$, 
i.e. minimise $\bar c^T\bar x$ subject to $\bar A\bar x = b$, $\bar x \ge 0$, $\qquad (5)$ 
where $\bar c^T = (c^T, 0^T)$ etc. 
The l.p.p. (5) is precisely in canonical primal form, so its dual, 
by (i), is 
maximise $\bar y^Tb$ subject to $\bar y^T\bar A \le \bar c^T$. $\qquad (6)$ 
Substituting for $\bar A$, $\bar c$ etc., we obtain 
maximise $\bar y^Tb$ subject to $\bar y^T(A, -I_m) \le (c^T, 0^T)$. $\qquad (7)$ 
Now we put $\bar y = y$ and (7) becomes 
maximise $y^Tb$ subject to $y^TA \le c^T$, $-y^T \le 0^T$, 
i.e. maximise $y^Tb$ subject to $y^TA \le c^T$, $y \ge 0$. $\qquad (8)$ 
The l.p.p. (8) is exactly the standard dual defined in (ii) and hence 
the two definitions (i) and (ii) are really the same. 


Theorem 7 
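The dual of the dual is the primal. 
By theorem 6 we can consider either form. 
We convert the canonical dual to canonical primal form as follows: 

maximise $y^Tb$ subject to $y^TA \le c^T$, $\qquad (9)$ 

i.e. maximise $(u - v)^Tb$ subject to $(u - v)^TA \le c^T$, $u, v \ge 0$, 

i.e. maximise $(u^T, v^T)\begin{pmatrix} b \\ -b \end{pmatrix}$ subject to $(u^T, v^T)\begin{pmatrix} A \\ -A \end{pmatrix} \le c^T$, $u, v \ge 0$, 

i.e. maximise $(u^T, v^T, w^T)\begin{pmatrix} b \\ -b \\ 0 \end{pmatrix}$ subject to $(u^T, v^T, w^T)\begin{pmatrix} A \\ -A \\ I_n \end{pmatrix} = c^T$, $\begin{pmatrix} u \\ v \\ w \end{pmatrix} \ge 0$, 

i.e. minimise $\begin{pmatrix} u \\ v \\ w \end{pmatrix}^T\begin{pmatrix} -b \\ b \\ 0 \end{pmatrix}$ subject to $\begin{pmatrix} u \\ v \\ w \end{pmatrix}^T\begin{pmatrix} A \\ -A \\ I_n \end{pmatrix} = c^T$, $\begin{pmatrix} u \\ v \\ w \end{pmatrix} \ge 0$, 

i.e. minimise $(-b^T, b^T, 0^T)\begin{pmatrix} u \\ v \\ w \end{pmatrix}$ subject to $(A^T, -A^T, I_n)\begin{pmatrix} u \\ v \\ w \end{pmatrix} = c$, $\begin{pmatrix} u \\ v \\ w \end{pmatrix} \ge 0$, 

i.e. minimise $\bar c^T\bar x$ subject to $\bar A\bar x = \bar b$, $\bar x \ge 0$, $\qquad (11)$ 

where $\bar c = \begin{pmatrix} -b \\ b \\ 0 \end{pmatrix}$, $\bar x = \begin{pmatrix} u \\ v \\ w \end{pmatrix}$, $\bar A = (A^T, -A^T, I_n)$, $\bar b = c$. 

The l.p.p. (11) is precisely in canonical primal form, so its dual 
is defined by (i) to be 
maximise $\bar y^T\bar b$ subject to $\bar y^T\bar A \le \bar c^T$, $\qquad (12)$ 
i.e. maximise $\bar y^Tc$ subject to $\bar y^T(A^T, -A^T, I_n) \le (-b^T, b^T, 0^T)$, 
i.e. minimise $(-\bar y)^Tc$ subject to 
$(-\bar y)^T(A^T, -A^T, I_n) \ge (b^T, -b^T, 0^T)$. 

Now we put $-\bar y = x$ because $\bar y$ must be an $n$-vector and we obtain 
the l.p.p. 
minimise $x^Tc$ subject to $x^TA^T \ge b^T$, $x^T(-A^T) \ge -b^T$, $x^T \ge 0^T$, 

i.e. minimise $c^Tx$ subject to $Ax \ge b$, $Ax \le b$, $x \ge 0$, 
i.e. minimise $c^Tx$ subject to $Ax = b$, $x \ge 0$, 
which is the canonical primal. 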

The proofs of theorems 6 and 7 demonstrate how to obtain the 
dual of any l.p.p. First convert it to any one of the four forms in 
(i) and (ii) and then use the appropriate definition (i) or (ii). It is 
not necessary to be quite as tedious as we have been in the proofs 
above, but systematic conversion of variables, constraints, and objec- 
tive function is helpful, and the renaming process of (11) avoids any 
confusion that might be caused by quantities having the ‘‘wrong name’’, 
e.g. y for the variables in primal form. Any /.p.p. and its dual problem 
should together be regarded as a single entity. Given any such pair 
of problems it is not, strictly speaking, the case that one is the primal 
and one the dual; each is the dual of the other. 

However, in practice, as we convert any /.p.p. to canonical (primal) 
form to solve it, it is convenient, particularly for theoretical purposes, 
to refer to this form of the given problem simply as the primal, 
and to the dual /.p.p. of the problem in this form simply as the 
dual. 

We will also find it convenient to call the objective function of 
the dual g(y), or simply g, thus g(y) = y’b. 


5.3 

The situation of the dietician and the salesman suggests, correctly, 
a profound connection between a /.p.p. and its dual problem. We 
now solve a simple example of a l.p.p. and its dual because this
will suggest, again correctly, more connections and will be useful
for explaining some notation which we will need to introduce.


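Primal problem:
    minimise $f(x) = 6x_1 + 8x_2 + 7x_3 + x_4 + 15x_5$
    subject to  $x_1 + x_3 + 3x_5 \ge 6$,    (1)
                $x_2 + x_3 - x_4 + x_5 \ge 5$,   $x_1, ..., x_5 \ge 0$.
We introduce two surplus variables, $x_6$ and $x_7$, to convert to canonical
form and since we have $I_2 \subset A$ we have the initial simplex tableau

       1     0     1     0     3    -1     0  |    6
       0     1     1    -1     1     0    -1  |    5      (2)
       6     8     7     1    15     0     0  |    0

Reducing the cost coefficients of the basic variables $x_1$ and $x_2$ to zero,
and then performing two simplex stages, gives in turn

       1     0     1     0     3    -1     0  |    6
       0     1     1    -1     1     0    -1  |    5
       0     0    -7     9   -11     6     8  |  -76

      1/3    0    1/3    0     1   -1/3    0  |    2
     -1/3    1    2/3   -1     0    1/3   -1  |    3
     11/3    0  -10/3    9     0    7/3    8  |  -54

      1/2  -1/2    0    1/2    1   -1/2   1/2 |   1/2
     -1/2   3/2    1   -3/2    0    1/2  -3/2 |   9/2     (3)
       2     5     0     4     0     4     3  |  -39

where $x_5$ and $x_3$ are the basic variables at optimality, with values
1/2 and 9/2 respectively.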
Dual problem: this involves only two variables so we can easily solve
it graphically. It is
    maximise $6y_1 + 5y_2$, subject to
        $y_1 \le 6$             ①
        $y_2 \le 8$             ②
        $y_1 + y_2 \le 7$       ③        (4)
        $-y_2 \le 1$            ④
        $3y_1 + y_2 \le 15$     ⑤
    and $y_1, y_2 \ge 0$.

The optimum value of g(y) occurs at the point A, which is the
point $y_1 = 4$, $y_2 = 3$ and the intersection of constraint boundaries ③
and ⑤, and $g_{opt} = 4 \times 6 + 3 \times 5 = 39$, the same as the optimum value
of the primal.

Comparing the c-row in the initial and optimum tableaux we see
that the overall row multipliers are -4 and -3. The only way to
produce $c'_1 = 2$ and $c'_2 = 5$ is to add to c -4 × (1st row of A) and
-3 × (2nd row of A), although this was done in four row-operations
above. We also notice that $c'_6 = 4$, $c'_7 = 3$ in (3) above, corresponding
to $-I_2$ in columns 6 and 7 of A.

The simplex operations to get from tableau (2) to tableau (3) are
equivalent to pre-multiplication of (A, b) by
    $\begin{pmatrix} 1/2 & -1/2 \\ -1/2 & 3/2 \end{pmatrix}$.

The columns of $I_2$ in the final tableau are (in order) column 5
and column 3, and the corresponding columns of A provide the
2 × 2 matrix B,
    $B = \begin{pmatrix} 3 & 1 \\ 1 & 1 \end{pmatrix}$ and $B^{-1} = \begin{pmatrix} 1/2 & -1/2 \\ -1/2 & 3/2 \end{pmatrix}$.    (5)

Of course $B^{-1}$ must be present in the final tableau in columns 1 and
2, because these columns of A were $I_2$.

Before producing $y_{opt}$ using $B^{-1}$, we might as well observe that
in (3), following section 3.7,
    $c'^T = c^T - 4e_1^T A - 3e_2^T A$    (6)
          $= c^T - (4, 3)A$,
and $B^{-1} = E_2 E_1 = \begin{pmatrix} 1 & -1/2 \\ 0 & 3/2 \end{pmatrix} \begin{pmatrix} 1/3 & 0 \\ -1/3 & 1 \end{pmatrix}$ (ER).    (7)

Now consider the 2-vector, $\hat{c}$ say, of original cost coefficients
corresponding to columns of $I_2$ (in order) in the final tableau.

We have $\hat{c} = \binom{c_5}{c_3} = \binom{15}{7}$ and
    $\hat{c}^T B^{-1} = (15, 7) \begin{pmatrix} 1/2 & -1/2 \\ -1/2 & 3/2 \end{pmatrix} = (4, 3) = y_{opt}^T$.

Before we prove that not one of these observations is a coincidence, 
we observe that since this example is given in standard form, but 
solved in canonical form, it effectively illustrates both forms of the 
primal and dual /.p.p.s. 
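
The observations above can be checked with exact arithmetic. The following is a minimal sketch, not part of the original text: it assumes the canonical-form data of the example (columns 6 and 7 being the surplus variables) and uses a hypothetical helper `inverse_2x2`. It forms B from columns 5 and 3 of A, and computes $B^{-1}b$, $\hat{c}^T B^{-1}$ and the e.c.c.s $c^T - y^T A$.

```python
from fractions import Fraction as F

# Data of the section 5.3 example in canonical form (x6, x7 are surplus variables).
A = [[1, 0, 1, 0, 3, -1, 0],
     [0, 1, 1, -1, 1, 0, -1]]
b = [6, 5]
c = [6, 8, 7, 1, 15, 0, 0]

basis = [5, 3]  # basic variables at optimality, in the order of the columns of I_2


def inverse_2x2(M):
    """Exact inverse of a 2x2 matrix (hypothetical helper for this sketch)."""
    (p, q), (r, s) = M
    det = F(p * s - q * r)
    return [[s / det, -q / det], [-r / det, p / det]]


B = [[A[i][j - 1] for j in basis] for i in range(2)]
B_inv = inverse_2x2(B)
c_hat = [c[j - 1] for j in basis]

# Primal basic solution x_B = B^{-1} b and dual vector y^T = c_hat^T B^{-1}.
x_B = [sum(B_inv[i][k] * b[k] for k in range(2)) for i in range(2)]
y = [sum(c_hat[i] * B_inv[i][k] for i in range(2)) for k in range(2)]

f_opt = sum(ci * xi for ci, xi in zip(c_hat, x_B))
g_opt = sum(yi * bi for yi, bi in zip(y, b))

# e.c.c.s c' = c - y^T A; every entry non-negative means y satisfies y^T A <= c^T.
ecc = [c[j] - sum(y[i] * A[i][j] for i in range(2)) for j in range(7)]
```

Under these assumptions `x_B`, `y` and `ecc` agree with the values quoted in the text, and the two objective values coincide.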

Note that the use of the notation $\hat{c}$ in the context of the solution
of the dual problem is unrelated to its use elsewhere, e.g. (5) and
(11) section 5.2 and problem II section 4.3.


5.4 Theorem 8. The Duality Theorem 
For any l.p.p. and its dual problem, if either problem has an optimum
solution then so does the other and the optimum values are equal. ∎

Corollary If either problem has unbounded feasible solutions* then
the other has no feasible solutions. ∎

(*This is a convenient, but rather loose, abbreviation for feasible 
solutions, the values of which are unbounded in the appropriate sense.) 

By theorems 6 and 7, to establish the duality theorem it is sufficient 
to prove that if the canonical primal problem has an optimum solution, 
then so has the dual and the optimum values are the same. 

The duality theorem is by far the most important result in linear 
programming. It establishes a second dimension to the general theory 
and plays a prominent part in almost all applications and special 
methods and it, or its equivalent, appears in some quite unexpected 
places. Partly because of its central importance, and partly because 
each has its own particular interest, we shall provide three different 
proofs. 

The first two use information from the simplex method, and are 
constructive proofs in that they provide an explicit expression for 
the solution of the dual problem. The third, by contrast, is more 
abstract and analytic in nature and appears in Appendix 2. It does 
not rely on the simplex method, but establishes the required result 
without providing a method for obtaining the solution of the dual. 
Of course, the dual problem could be solved by first converting to 
canonical primal form and then using the simplex method, but as 
we shall see, whenever the primal is solved the solution of the dual 
is obtainable immediately. 


First Proof of the Duality Theorem 

We assume that the problem is in canonical primal form and that
$A \supset I_m$, so in practice this will usually be at the end of the first
part of the two-part simplex method. Previous to this stage the tableaux
have been concerned with the artificial problem, and only at the
end of the first part do we obtain the initial simplex tableau for
the original problem. For convenience we will denote the constraint
equations at this stage by Ax = b, and we denote the first m columns
of A by B.

At any stage of the simplex method, let $j_1, j_2, ..., j_m$ be the column
indices of the columns of $I_m$, and denote by $\hat{c}$ the m-vector whose
i-th element is the $j_i$-th (original) cost coefficient,
    i.e. $(\hat{c})_i = c_{j_i}$, $\hat{c} = (c_{j_1}, c_{j_2}, ..., c_{j_m})^T$.
Now for j = 1, 2, ..., n denote the scalar product of this vector with
the j-th column of constraint coefficients by $w'_j$,
    i.e. $w'_j = \sum_{i=1}^m (\hat{c})_i a'_{ij} = \sum_{i=1}^m c_{j_i} a'_{ij}$.    (1)
Thus $w'^T$ is the n-vector $\hat{c}^T A'$, the dashes indicating a general stage
(see section 3.2).
So, in the tableau (2) in the previous section, for example
    $j_1 = 1$, $j_2 = 2$, $\hat{c} = (6, 8)^T$, $w_1 = 6$, $w_2 = 8$, $w_3 = 14$, ....
In the tableau (3) of the previous section, for example
    $j_1 = 5$, $j_2 = 3$, $c_{j_1} = c_5$, $c_{j_2} = c_3$,
    $\hat{c} = (c_5, c_3)^T = (15, 7)^T$, $w'_1 = 4$, $w'_2 = 3$, $w'_3 = 7$, ...


Lemma
At any stage of the simplex method
    $w'_j = c_j - c'_j$ = j-th original cost coefficient - j-th e.c.c. ∎

We establish this result by induction.

Assume for convenience and w.l.o.g. that the (n - m + i)-th column
of A is the i-th column of $I_m$ for i = 1, 2, ..., m. (Thus A has the
form $(B, ..., I_m)$.) Then the e.c.c.s c' in the first simplex tableau
(of the second stage) are defined by
    $c'^T = c^T - \sum_{i=1}^m c_{n-m+i} \times$ (i-th row of A),
    i.e. $c'_j = c_j - \sum_{i=1}^m c_{n-m+i} a_{ij}$, j = 1, 2, ..., n, and
    $j_1 = n-m+1$, $j_2 = n-m+2$, ..., $j_m = n$.
Also $\hat{c} = (c_{n-m+1}, ..., c_n)^T$, so that $w_j = \sum_{i=1}^m (\hat{c})_i a_{ij} = c_j - c'_j$. Thus
the assertion of the lemma is true for the initial simplex tableau.

Now, using the notation established in section 3.2, and assuming
w.l.o.g. that the first m columns of A' are $I_m$, we show that
    $w'_j = c_j - c'_j$, j = 1, 2, ..., n, implies
    $w''_j = c_j - c''_j$, j = 1, 2, ..., n.
It is helpful to refer back to the two tableaux pictured in section
3.2, and to recall that
    $a''_{sj} = a'_{sj} / a'_{st}$,
    $a''_{ij} = a'_{ij} - a'_{it} \times a'_{sj} / a'_{st}$, j = 1, 2, ..., n, i = 1, 2, ..., m, i ≠ s,
    and $c''_j = c'_j - c'_t \times a'_{sj} / a'_{st}$, j = 1, 2, ..., n.
Now, $w'_j = a'_{1j} c_1 + a'_{2j} c_2 + ... + a'_{sj} c_s + ... + a'_{mj} c_m$, because in
this case $j_1 = 1$, $j_2 = 2$, ..., $j_s = s$, ..., $j_m = m$,
and $w'_j = c_j - c'_j$ by hypothesis.
Also, $w''_j = a''_{1j} c_1 + a''_{2j} c_2 + ... + a''_{sj} c_t + ... + a''_{mj} c_m$, because
now $j_s = t$ and $j_i = i$, i = 1, 2, ..., m, i ≠ s.
So
    $w''_j = \sum_{i \ne s} (a'_{ij} - a'_{it} a'_{sj} / a'_{st}) c_i + (a'_{sj} / a'_{st}) c_t$
          $= \sum_{i=1}^m a'_{ij} c_i - (a'_{sj} / a'_{st}) \sum_{i=1}^m a'_{it} c_i + (a'_{sj} / a'_{st}) c_t$
          $= w'_j - (a'_{sj} / a'_{st}) (w'_t - c_t)$
          $= c_j - c'_j + (a'_{sj} / a'_{st}) c'_t$
          $= c_j - c''_j$. ∎

Using this result, the proof of the duality theorem can now be completed
without difficulty.


Denote the optimum solution by $x_{opt}$, and assume that in the final
(optimum) tableau the columns of $I_m$ in order are in the first m columns
of the tableau. Thus $\hat{c}^T_{opt} = (c_1, c_2, ..., c_m)$, and $c'_{opt} \ge 0$. Since the
first m columns of A were initially the matrix B, the simplex operations
overall are equivalent to pre-multiplication by $B^{-1}$, so $b'_{opt} = B^{-1}b$,
    i.e. $(x_{opt})_i = (B^{-1}b)_i$, i = 1, 2, ..., m, $(x_{opt})_i = 0$, i = m+1, ..., n.

Now consider the m-vector $y_0$ where
    $y_0^T = \hat{c}^T B^{-1}$.    (2)
This vector is feasible for the dual problem because
    $y_0^T A = \hat{c}^T B^{-1} A$, and so
    $(y_0^T A)_j = (\hat{c}^T A')_j$, at the optimum stage
                $= w'_j$, at the optimum stage
                $= c_j - c'_j$, at the optimum stage.
Thus $(y_0^T A)_j - c_j = -c'_j \le 0$, j = 1, 2, ..., n.    (3)

The optimum value of f,
    $f(x_{opt}) = c^T x_{opt} = \sum_{j=1}^n c_j (x_{opt})_j$
               $= \sum_{i=1}^m c_i (x_{opt})_i$, in this case
               $= \sum_{i=1}^m c_i b'_i$ at the optimum stage
               $= \sum_{i=1}^m c_i (B^{-1}b)_i$,
and the value of the objective function of the dual
    $g(y_0) = \sum_{i=1}^m (y_0)_i b_i = \sum_{i=1}^m (\hat{c}^T B^{-1})_i b_i$.    (4)
Therefore $f(x_{opt}) = g(y_0)$.
Now, for any feasible solutions of the primal and dual problems
x and y,
    $g(y) = y^T b = y^T (Ax) = (y^T A) x \le c^T x = f(x)$.    (5)
Hence $y_0$ is in fact an optimum solution of the dual. ∎


Proof of Corollary 

Suppose the assertion is false, so that for any K there is a feasible
solution $x_K$ of the primal such that $f(x_K) < K$, and suppose that the
dual problem has a feasible solution y. Then just choose K = g(y),
and as we have just seen
    $g(y) \le f(x_K)$,
which is a contradiction.
It may be helpful to imagine an objective function axis on which 
g(y) and f(x) can be indicated for any feasible y and x. 


[Figure: an objective function axis, with g(y) marked to the left of f(x).]


The primal objective function values are all to the right of any 
dual objective function values, from (5). These simple considerations 
make it clear that, apart from the vital assertion that equality must 
occur for optimum solutions, the duality theorem can be established 
from elementary considerations without the aid of the simplex method. 


5.5 Consequences of the Duality Theorem 

This and the following two sections are a collection of observations 
and further results that can be regarded as immediate consequences 
of theorem 8 itself and of the first proof of theorem 8. The three 
sections are labelled practical, theoretical and economic consequences, 
but these are not meant to be precise labels and it is in no sense 
an exhaustive collection. In fact, the duality theorem plays a prominent 
role in almost everything that follows. This chapter is completed with 
a description of the primal-dual algorithm which is a variant of the 
simplex method using the duality theorem directly. 


Practical Consequences 
It is very important to realise that the optimum solution of the
dual is provided by the simplex method when solving the primal.
Referring back to (1) of the previous section and the lemma, we
see that the elements of $y_{opt}$ are elements of $w'_{opt}$, and since
$w'_j = c_j - c'_j$ we obtain elements of the optimum dual solution by
subtracting appropriate e.c.c.s of the final simplex stage from original
cost coefficients. Now $(y_{opt})_i$ is given by

    $\hat{c}^T \times$ (i-th column of $B^{-1}$),

and as the initial simplex tableau (at the beginning of the second
part) contains the columns of $I_m$, the columns of $B^{-1}$ are in the
corresponding columns of the final simplex tableau. So, suppose
initially the i-th column of $I_m$ is the $j_i$-th column of A, i = 1, 2, ..., m,
then

    $(y_{opt})_i = c_{j_i} - c'_{j_i}$,    (1)
where the e.c.c.s are from the final stage.
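
Formula (1) can be tried on the example of section 5.3. The sketch below is an illustration only: the function `pivot` is a hypothetical helper, the tableau layout (cost row last, corner entry holding minus the objective value) is an assumption, and the data are those of the 5.3 example in canonical form. Four pivots reproduce the four row-operations on the c-row, after which the dual solution is read from columns 1 and 2, where $I_2$ stood initially.

```python
from fractions import Fraction as F


def pivot(T, s, t):
    """Pivot tableau T (list of rows, cost row last) on row s, column t (0-based)."""
    p = T[s][t]
    T[s] = [v / p for v in T[s]]
    for i in range(len(T)):
        if i != s:
            m = T[i][t]
            T[i] = [v - m * w for v, w in zip(T[i], T[s])]


c = [6, 8, 7, 1, 15, 0, 0]
# Rows (A | b) followed by the original cost row (c | 0).
T = [[F(v) for v in row] for row in (
    [1, 0, 1, 0, 3, -1, 0, 6],
    [0, 1, 1, -1, 1, 0, -1, 5],
    c + [0],
)]

pivot(T, 0, 0)  # price out x1: zero its cost coefficient
pivot(T, 1, 1)  # price out x2; the cost row now holds the initial e.c.c.s
pivot(T, 0, 4)  # first simplex stage: x5 becomes basic
pivot(T, 1, 2)  # second simplex stage: x3 becomes basic

ecc = T[2][:7]
# I_2 occupied columns 1 and 2 of the initial A, so j_1 = 1 and j_2 = 2.
y_opt = [c[j] - ecc[j] for j in (0, 1)]
```

With these data the final e.c.c.s are (2, 5, 0, 4, 0, 4, 3) and the sketch gives `y_opt` equal to (4, 3), in agreement with the graphical solution of the dual.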


Example 1 
Consider the l.p.p. solved in section 3.4. Here, for the problem
as given, $I_3 \subset A$ so
    $j_1 = 4$, $j_2 = 5$, $j_3 = 6$, $c_4 = 0$, $c_5 = 0$, $c_6 = 0$,
and from the final simplex tableau ci = ‘', cf =2, cf = 0, 
SO Y,,, = (—4 1-4) 0)’. 
Note that 
7 it 3 
a = (-7, =2, +4, si. a 0), 
which is = (-1, -2, -1, 0, 0, 0) =e’, (2) 
and 8(Yop:) = Yop = -F — 2 = -10 =c7x,,, = f(X,p,)- 


Example 2 
Consider the /.p.p. solved in section 4.4. Referring to tableau (5) 
presented there, 


    $j_1 = 2$, $j_2 = 4$, $c_2 = 5$, $c_4 = 2$,
and from the final tableau $c'_2 = 1$, $c'_4 = 0$,
so $y_{opt} = (4, 2)^T$.

Again we note that 


4 

Yond = (4, 2) CG 

2 

which is < (1, 5, 2, 2, 7), 


c= 
Nie tle 
sate ies 
nee” 
Il 
-_~ 
- 
> 
nN 
Ae 
i 


l 


5 
andy” b= (4, 2(i) = Il =e"x,,,. (3) 
2 
As these two examples demonstrate, the duality theorem provides
a foolproof check or verification. For any l.p.p. and its dual, if $x_0$
and $y_0$ are believed to be optimum, then if they satisfy their respective
constraints and $f(x_0) = g(y_0)$ they are indeed optimum. When using
this means of checking a calculated solution $x_0$ it is vital to remember
that $y_0$ must satisfy the dual constraints. For example $y^T = (3, 7)$,
in example 2 above, gives $y^T b = 11$ but $(3, 7)^T$ is not an optimum
solution of the dual because it does not satisfy the dual constraints.

It is equally vital to realise, in example 2 above, that the vector
$y_{opt}$ obtained is not the solution of the dual of the l.p.p. as given
at the beginning of section 4.4, but only of the l.p.p. whose constraint
equations are described by A'x = b' in section 4.4. The row operations
which change the constraints from $Ax = b$, $x \ge 0$ to $A'x = b'$, $x \ge 0$
leave the solution and optimum value of the primal problem unchanged. 
Therefore, by the duality theorem, the optimum value of the dual 
problem is unchanged, but the row operations on A do change the 
dual constraints and the solution of the dual problem (see section 
6.2). 


Another observation, which comes directly from (5) in the previous 
section, is that we have a measure of near optimality. For any feasible 
vectors x and y, $f(x) - g(y)$ is the largest improvement that can be
obtained for either objective function. For example, in the l.p.p. in
canonical form in section 3.4, $x = (0.1, 3.6, 1.8, 0, 0.4, 0.2)^T$ and
$y^T = (-2.75, -0.75, -0.25)$ both satisfy the appropriate constraints,
but neither is an optimum solution. As $f(x) = c^T x = -9.1$ and
$g(y) = y^T b = -11.5$, an optimum solution for either problem will produce
an improvement in the objective function value of at most 2.4. In
this case both x and y are close to the optimum solutions, but this
need not necessarily be the case in general when f(x) is close to
g(y) (ER).
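
The measure of near optimality is easy to compute. The sketch below is illustrative rather than taken from the text: `duality_gap` is a hypothetical helper, and because the data of section 3.4 are not reproduced here it is applied to the standard-form pair of section 5.3 with two feasible vectors chosen for the purpose.

```python
from fractions import Fraction as F


def duality_gap(A, b, c, x, y):
    """Return f(x) - g(y) for the standard-form pair
    min c^T x, Ax >= b, x >= 0  and  max y^T b, y^T A <= c^T, y >= 0,
    after checking that x and y are feasible."""
    m, n = len(A), len(c)
    Ax = [sum(A[i][j] * x[j] for j in range(n)) for i in range(m)]
    yA = [sum(y[i] * A[i][j] for i in range(m)) for j in range(n)]
    if not (all(v >= 0 for v in x) and all(Ax[i] >= b[i] for i in range(m))):
        raise ValueError("x is not feasible for the primal")
    if not (all(v >= 0 for v in y) and all(yA[j] <= c[j] for j in range(n))):
        raise ValueError("y is not feasible for the dual")
    f = sum(c[j] * x[j] for j in range(n))
    g = sum(y[i] * b[i] for i in range(m))
    return f - g


# Standard-form data of the section 5.3 example.
A = [[1, 0, 1, 0, 3],
     [0, 1, 1, -1, 1]]
b = [6, 5]
c = [6, 8, 7, 1, 15]

gap_rough = duality_gap(A, b, c, [0, 0, 5, 0, 1], [3, 3])               # f = 50, g = 33
gap_opt = duality_gap(A, b, c, [0, 0, F(9, 2), 0, F(1, 2)], [4, 3])     # both equal 39
```

The first pair brackets the common optimum value 39 with a gap of 17; the second pair has gap zero, which by (5) of the previous section certifies optimality.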


Finally, since when solving a /.p.p. we automatically obtain the 
solution of the dual problem, in practice we should solve whichever 
of the two will be easier when converted to canonical form. This 
will generally be the one which involves the smaller number of equality 
constraints and so, for problems in which m > n, one would normally 
solve the dual problem directly.


5.6 Theoretical Consequences

For any situation which gives rise to a l.p.p. the duality theorem
provides a statement about that situation. These statements are 
sometimes celebrated results in their own right, e.g. the minimax 
theorem of game theory (see chapter 13) and theorems of alternatives 
for matrices (see chapter 6), and some, including these two, were 
established before the duality theorem. 

The elegant and subtle result we are concerned with in this section 
applies to /.p.p.s in general, and relates zero or non-zero variables 
at optimality with inequality type constraints which are active or not. 


Theorem 9. The Equilibrium or Complementary Slackness Theorem
For optimum solutions of the primal and dual l.p.p.s in standard
form, the constraints of either problem corresponding to non-zero
variables for the other are satisfied as equalities. ∎

Let $x_0$ and $y_0$ be optimum solutions. Thus
    $Ax_0 \ge b$, $x_0 \ge 0$, $y_0^T A \le c^T$, $y_0 \ge 0$,
    and $c^T x_0 = y_0^T b$.

Hence $y_0^T A x_0 \ge y_0^T b$ and $y_0^T A x_0 \le c^T x_0$,
i.e. $y_0^T b \le y_0^T A x_0 \le c^T x_0$,
and so $y_0^T b = y_0^T A x_0 = c^T x_0$,
and so $y_0^T (b - Ax_0) = 0 = (y_0^T A - c^T) x_0$.    (1)

Examining the left-hand equation of (1) we observe that $y_0 \ge 0$
and $(b - Ax_0) \le 0$, so that $y_0^T (b - Ax_0) = 0$ if and only if
    $(y_0)_i (b - Ax_0)_i = 0$, i = 1, 2, ..., m.

Thus $(y_0)_i > 0$ implies that $(Ax_0)_i = b_i$.

A similar examination of the right-hand equation of (1) establishes
the result. ∎

For a general /.p.p. we expect the optimum solution to be non-degen- 
erate, and so (for m < n) the m constraints of the dual corresponding 
to the m basic variables of the primal are satisfied as equalities at 
optimality. This provides a straightforward method of verifying that 
a particular vector is optimum when the "simplex record" is not
available: namely, solve the m dual constraints as a system of equations,
and check that the solution obtained is non-negative (for standard
form) and has the same value.

Example

Consider the primal and dual problems discussed in section 5.3.
The basic variables at optimality for the primal are $x_3$ and $x_5$ and
the third and fifth constraints of the dual are satisfied as equations
by the dual solution $y_0^T = (4, 3)$.

Alternatively, with just the two l.p.p.s and a suggestion that
$x_0^T = (0, 0, 9/2, 0, 1/2)$ is the primal optimum solution, the third and fifth
dual constraints are $y_1 + y_2 \le 7$, $3y_1 + y_2 \le 15$. Solving the equations

    $\begin{pmatrix} 1 & 1 \\ 3 & 1 \end{pmatrix} \binom{y_1}{y_2} = \binom{7}{15}$

gives $\binom{y_1}{y_2} = \binom{4}{3}$. This value of $\binom{y_1}{y_2}$ is non-negative and satisfies
all the constraints and $(4, 3) \binom{6}{5} = 39$, verifying that $x_0$ is optimum.


Using the problem of section 5.3 to illustrate the equilibrium theorem 
suggests, correctly, that there is a corresponding result for the primal 
and dual problems in canonical form (see exercise 5.8). 
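
The verification procedure just described can be sketched in a few lines. This is an illustration under stated assumptions, not the book's own program: it uses the standard-form data of section 5.3, assumes the suggested solution is non-degenerate with exactly two positive components, and solves the two tight dual constraints by Cramer's rule.

```python
from fractions import Fraction as F

A = [[1, 0, 1, 0, 3],
     [0, 1, 1, -1, 1]]
b = [6, 5]
c = [6, 8, 7, 1, 15]
x0 = [0, 0, F(9, 2), 0, F(1, 2)]  # suggested primal optimum

# Indices (0-based) of the positive primal variables: here x3 and x5.
support = [j for j, v in enumerate(x0) if v > 0]
j1, j2 = support  # assumes exactly m = 2 positive components

# Solve y1*A[0][j] + y2*A[1][j] = c[j] for j = j1, j2 by Cramer's rule.
det = F(A[0][j1] * A[1][j2] - A[1][j1] * A[0][j2])
y = [(c[j1] * A[1][j2] - A[1][j1] * c[j2]) / det,
     (A[0][j1] * c[j2] - c[j1] * A[0][j2]) / det]

yA = [sum(y[i] * A[i][j] for i in range(2)) for j in range(5)]
dual_feasible = all(v >= 0 for v in y) and all(yA[j] <= c[j] for j in range(5))
same_value = sum(yi * bi for yi, bi in zip(y, b)) == sum(cj * xj for cj, xj in zip(c, x0))

# Converse half of the theorem: positive dual variables force tight primal constraints.
Ax = [sum(A[i][j] * x0[j] for j in range(5)) for i in range(2)]
primal_tight = all(Ax[i] == b[i] for i in range(2) if y[i] > 0)
```

If `dual_feasible` and `same_value` both hold, the duality theorem confirms that `x0` is optimum; here the solved vector is (4, 3) and both checks succeed.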

It is worth mentioning here, for those who have met Lagrange 
multipliers, that the dual variables are just the Lagrange multipliers 
for the problem 


minimise $f(x) = c^T x$ subject to $Ax = b$, $x \ge 0$,
and that the equilibrium theorem corresponds to the Kuhn-Tucker 
conditions of non-linear optimisation (see {10}, {12}, {13}, {14}). 


5.7 Economic Consequences 

The basic point to be made is that the dual problem usually has 
a meaningful interpretation which provides useful, sometimes crucial, 
insights into the original problem. The dietician and the salesman 
with whom this chapter started are a good example. The dietician 
now knows that accepting the salesman’s offer will not produce a 
cheaper diet than one already available, and the salesman now knows 
that however he chooses his prices he cannot ensure for himself 
more than a certain fixed return. But more precise statements can 
be made connecting the real or natural economy of the dietician and 
the alternative or synthetic economy offered by the salesman when 
they are running optimally. 

For example, from the equilibrium theorem we see that the dietician 
supplies none of any food that is overpriced compared to its synthetic 
equivalent, i.e. of any food whose "real price" exceeds its "shadow
price". Also, the salesman will supply free of charge any nutrient
whose requirement is exceeded in the actual diet.

The dual of the transportation problem is left until chapter 10, 
but can be referred to now (see exercise 10.1). 

The dual of the manufacturer’s problem, exercise 1.5, is in standard 
primal form and can be interpreted as the problem of a rival manufac- 
turer with a takeover bid. The rival offers to buy the resources and 
wishes to do so at minimum cost. His offer will be unacceptable 
if the manufacturer would obtain more money by using his resources 
to continue to make the products. Here, for example, the rival will 
set a price of zero for any resource in surplus. Alternatively the 
dual variables provide a set of replacement prices for the resources 
which will exactly exhaust the manufacturer’s profit and (y,,,,), can 
therefore be interpreted as the implied or imputed value of one unit 
of the i-th resource. 

The relationship $c^T x = y^T b$, where x is the current b.f.s., is satisfied
at every stage of the simplex method if $y^T = \hat{c}^T B^{-1}$ and $\hat{c}$ and $B^{-1}$
are also given their current values (see exercise 6.9). As the elements
of x represent a feasible set of operating levels for the manufacturer’s 
n activities, the elements of y are the implied resource values corre- 
sponding to this manufacturing programme, so their interpretation 
as shadow prices is valid at all stages. For the optimum manufacturing 
programme, the equilibrium theorem shows that no activity is operated 
at a positive level if the activity would lose money if the resources 
used were costed at their implied prices. Also, if a disposal activity 
(slack variable) operates at a positive level the implied value of the 
corresponding resource is zero. 

The m-vector y is sometimes denoted by $\pi$ and its elements called
simplex multipliers or pricing multipliers. This is because the elements
of y at any stage provide the multiples of the rows of A that have
been subtracted from the cost coefficients c to obtain the e.c.c.s
c' (see section 6.1). (Remember the observation in section 5.5, that
the simplex multipliers $\hat{c}^T B^{-1}$ are just appropriate elements of the
n-vector w', and also that, as the manufacturer's problem is in standard
dual form, the introduction of m disposal activities takes us immediately
to part II with the original matrix of coefficients A unchanged.)

The economic model of the manufacturer’s situation contains several 
ideas used to develop interindustry models of (e.g. national) economies 
using input-output analysis. Suppose the economy is divided into n
industries or sectors, and in any one accounting period $x_j$ is the number
of units of output of the j-th industry, j = 1, 2, ..., n. If $a_{ij}$ is the
number of units of the i-th industry's output used to produce one
unit of the j-th industry's output, then $b = x - Ax = (I - A)x$ gives
the number of units of the output of each industry available for
non-productive consumption. The elements of the matrix A are
input-output coefficients and (I - A) is known as a Leontief matrix.
Thus a production vector x which results in a specified final-demand
vector b satisfies
    $(I - A)x = b$, $x \ge 0$,

or  $(I - A)x \ge b$, $x \ge 0$,
and in practice would also have to satisfy certain minimum or maximum
production constraints for certain industries, $x \ge u$ and $x \le v$. The
coefficients of the objective function, whether to be maximised or 
to be minimised in the problem being examined, are chosen to reflect 
the particular objective involved. 
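
As a small numerical illustration of the relation b = (I - A)x, the sketch below uses a purely hypothetical two-sector economy; the coefficients and production levels are invented for the example and the helper `final_demand` is not from the text.

```python
from fractions import Fraction as F

# Hypothetical input-output coefficients: A[i][j] is the amount of sector i's
# output used to produce one unit of sector j's output.
A = [[F(1, 5), F(3, 10)],
     [F(2, 5), F(1, 10)]]
x = [100, 80]  # hypothetical production vector


def final_demand(A, x):
    """Return b = (I - A) x, the output left for non-productive consumption."""
    n = len(x)
    return [x[i] - sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]


b = final_demand(A, x)
```

With these invented figures the first sector uses 20 + 24 units of its own output internally and the second 40 + 8, leaving 56 and 32 units for final demand.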

In an economic context the inverse of a square (m X m) matrix 
of coefficients is often of interest and we note that such an inverse 
is provided by the final simplex tableau (see also exercise 7.3). 


5.8 The Primal-Dual Algorithm 

This variant of the simplex method is more efficient for certain 
particular forms of l.p.p. and is a way of using the dual problem
and the duality theorem directly to aid progress towards the optimum 
solution. The idea is to satisfy the conditions of the equilibrium or 
complementary slackness theorem (section 5.6) using slackness in 
the dual to make efficient choices of primal variables to become 
basic. 

We shall establish the algorithm from a theoretical point of view 
and use an example to illustrate some of the practical details. The 
algorithm solves the /.p.p. 


P: minimise c’x subject to Ax = b, x => 0 


in the case $A \not\supset I_m$ without the artificial first part of the two-part
simplex method, although there are still up to m extra variables
involved. The savings over the two-part simplex method that might
be expected are not always realised in practice so the algorithm is
presented here as an interesting application of the duality results rather
than as an improved method. There are no consequences for later
chapters if this section is omitted.

Suppose we have a vector y satisfying the constraints $y^T A \le c^T$
of the dual problem
    D: maximise $y^T b$ subject to $y^T A \le c^T$
and we denote by J the set of constraint indices for which equality
occurs.
Thus $j \in J$ if $(y^T A)_j = c_j$
and $j \notin J$ if $(y^T A)_j < c_j$, j = 1, 2, ..., n.

Suppose now that $\binom{x}{v}$ is an optimum solution of the l.p.p.

    ARP: minimise $\sum_{i=1}^m v_i$ subject to $Ax + v = b$, $x \ge 0$, $v \ge 0$, $x_j = 0$ if $j \notin J$,
and suppose also that v = 0. Then x and y are optimum solutions
of P and D because Ax + v = b and v = 0 implies that Ax = b, and
so x is feasible for P and
    $y^T b = y^T A x = \sum_{j=1}^n (y^T A)_j x_j = \sum_{j \in J} (y^T A)_j x_j = \sum_{j \in J} c_j x_j$
          $= \sum_{j=1}^n c_j x_j = c^T x$.

The l.p.p. ARP may be written
    minimise $(0^T, e^T) \binom{x}{v}$ subject to $(A, I_m) \binom{x}{v} = b$, $\binom{x}{v} \ge 0$,
    $K_J x = 0$,

where $e^T = (1, 1, ..., 1)$ and $K_J$ is the (n - k) × n matrix whose rows
are the unit vectors $e_j^T$ for $j \notin J$, and k is the number of integers
in the set J. This l.p.p. is called the Associated Restricted Primal
and is the artificial problem of the two-part simplex method with
the extra constraints $K_J x = 0$. The vector y involved in the definition
of ARP improves at each stage, i.e. $y^T b$ increases, and as the
number of integers in J increases so $K_J$ decreases in size. The dual
of ARP, called the Associated Restricted Dual, is

    ARD: maximise $u^T b$ subject to $(u^T A)_j \le 0$, $j \in J$, $u \le e$ (ER).

One stage of the algorithm starts with a current vector y' say,
which is feasible for D, solves ARP, and hence solves ARD and
uses this dual solution to produce y* say, which is feasible for D
and which satisfies

    $y^{*T} b > y'^T b$.

So we denote by $\binom{x'}{v'}$ the optimum solution of ARP, where $v' \ne 0$

or x' is optimum for P, and denote by u' the optimum solution of
ARD.

If $u'^T A \le 0^T$, i.e. $(u'^T A)_j \le 0$, j = 1, 2, ..., n, then P is infeasible,
because for any $\alpha > 0$
    $(y' + \alpha u')^T A = y'^T A + \alpha u'^T A \le c^T + \alpha u'^T A \le c^T$,
and $(y' + \alpha u')^T b = y'^T b + \alpha u'^T b = y'^T b + \alpha e^T v'$,
so that there are feasible vectors for D with values unbounded above.
Assuming P is feasible, there must be some j for which $(u'^T A)_j > 0$,
and we define

    $\theta' = \min_{j : (u'^T A)_j > 0} \frac{c_j - (y'^T A)_j}{(u'^T A)_j} = \frac{c_t - (y'^T A)_t}{(u'^T A)_t}$.

For $\theta$ small enough and positive,
    $y(\theta)^T A = (y' + \theta u')^T A = y'^T A + \theta u'^T A \le c^T$,
because $(u'^T A)_j \le 0$ for $(y'^T A)_j = c_j$, and
    $y(\theta)^T b = y'^T b + \theta u'^T b > y'^T b$.

Thus $y(\theta)$ satisfies the constraints of D and $y(\theta)^T b$ increases as $\theta$
increases. The maximum value that we can give to $\theta$ is $\theta'$ and for
this value we have $y^* = y' + \theta' u'$. The set J is changed because
the column index t has to be included, but column indices corresponding
to positive $x_j$ remain in J (ER). This gives a new ARP and takes
us to the beginning of the next stage.

It is important to see that solving ARP at each stage consists simply
of a normal simplex stage for P with $x_t$ becoming basic. The vector
$\binom{x'}{v'}$ is feasible for the next ARP because the constraints are just
those of the current ARP except that one is removed, and the only
way $\binom{x'}{v'}$ could be improved is to increase $x_t$ from its current value
of zero. Provided that increasing $x_t$ does reduce the value of the
new ARP from that given by $\binom{x'}{v'}$, then the pivotal operations


@ and @) of the simplex method (section 3.2) will optimise the 
new ARP. The operations are equivalent to a simplex stage for P 
(without calculating c’) with a different motivation for the choice 
of pivotal column. To prove that the value of the ARP decreases 
it is sufficient to prove that in the e.c.c.s. for the ARP, $c'_t < 0$.
Now the previous stages, as they consist just of row operations on
the coefficients $(A, I_m, b)$, can be represented by pre-multiplication
by a matrix $B^{-1}$, so the e.c.c.s. for ARP are given by

    $c'^T = c^T - \hat{c}^T B^{-1} (A, I_m)$,
where $c^T = (0^T, e^T)$ and $\hat{c}$ is as defined in section 5.4.

As we have seen, $\hat{c}^T B^{-1}$ gives the dual variables at optimality,
so that for the columns of A we have $c_j - c'_j = (u'^T A)_j$, and in
particular $c_t - c'_t = (u'^T A)_t$. As $c_t = 0$ and $(u'^T A)_t > 0$, it follows
that $c'_t < 0$.

A similar argument to that used for theorem 5 (section 3.5) can 
be used to show that the primal-dual algorithm reaches the optimum 
solution in a finite number of stages. 

The example below shows that the stages of the primal-dual algorithm 
can be performed in a sequence of tableaux similar to those of the 
simplex method. 

In this situation the rows labelled $-u'^T A$ correspond to the c'-rows
of the simplex method and are obtained in the same way, as are
the rows of the following tableau once the pivotal element has been
chosen. The extra row labelled $c^T - y'^T A$ is included for clarity
and so is the θ-row of ratios $(c_j - (y'^T A)_j) / (u'^T A)_j$, which of course,
must not be confused with the θ-column of ratios which determine
the pivotal row.

Beneath each tableau are listed the various quantities mentioned
in the description of the algorithm. Notice that for examples in which
$A \ge 0$ and $c \ge 0$, which is the case here, the feasible solution of
the dual needed to start the algorithm is given by y = 0.

Minimise $x_1 + 2x_2 + x_3 + 6x_4$
subject to $\begin{pmatrix} 1 & 1 & 1 & 2 \\ 2 & 1 & 3 & 1 \end{pmatrix} x = \binom{3}{4}$, $x \ge 0$.

                  x1    x2    x3    x4    v1    v2 |    b  |   θ
                   1     1     1     2     1     0 |    3  |   3
                   2     1    (3)    1     0     1 |    4  |  4/3
  -u'^T A         -3    -2    -4    -3     0     0 |   -7
  c^T - y'^T A     1     2     1     6
  θ               1/3    1    1/4    2

$y'^T = (0, 0)$, $y'^T A = (0, 0, 0, 0)$, J is empty,
$\binom{x'}{v'}^T = (0, 0, 0, 0, 3, 4)$, value = 7,
$u'^T = (1, 1)$, $u'^T A = (3, 2, 4, 3)$, $u'^T b = 7$,
$\theta' = 1/4$, $y^{*T} = (1/4, 1/4)$, $y^{*T} b = 7/4$.

                  1/3   2/3    0    5/3    1   -1/3 |   5/3 |   5
                 (2/3)  1/3    1    1/3    0    1/3 |   4/3 |   2
  -u'^T A        -1/3  -2/3    0   -5/3    0    4/3 |  -5/3
  c^T - y'^T A    1/4   3/2    0   21/4
  θ               3/4   9/4         63/20

$y'^T = (1/4, 1/4)$, $y'^T A = (3/4, 1/2, 1, 3/4)$, J = {3},
$\binom{x'}{v'}^T = (0, 0, 4/3, 0, 5/3, 0)$, value = 5/3,
$u'^T = (1, -1/3)$, $u'^T A = (1/3, 2/3, 0, 5/3)$, $u'^T b = 5/3$,
$\theta' = 3/4$, $y^{*T} = (1, 0)$, $y^{*T} b = 3$.

                   0   (1/2) -1/2   3/2    1   -1/2 |    1  |   2
                   1    1/2   3/2   1/2    0    1/2 |    2  |   4
  -u'^T A          0   -1/2   1/2  -3/2    0    3/2 |   -1
  c^T - y'^T A     0     1     0     4
  θ                      2          8/3

$y'^T = (1, 0)$, $y'^T A = (1, 1, 1, 2)$, J = {1, 3},
$\binom{x'}{v'}^T = (2, 0, 0, 0, 1, 0)$, value = 1,
$u'^T = (1, -1/2)$, $u'^T A = (0, 1/2, -1/2, 3/2)$, $u'^T b = 1$,
$\theta' = 2$, $y^{*T} = (3, -1)$, $y^{*T} b = 5$.

                   0     1    -1     3     2    -1 |    2
                   1     0     2    -1    -1     1 |    1
  -u'^T A          0     0     0     0     1     1 |    0

$y'^T = (3, -1)$, $y'^T A = (1, 2, 0, 5)$, J = {2, 1},
$\binom{x'}{v'}^T = (1, 2, 0, 0, 0, 0)$, value of P = 5.

In the last (incomplete) tableau, the optimum solution $\binom{x'}{v'}$ of
ARP has v' = 0, so we have the optimum solution of the primal
problem given by $x = (1, 2, 0, 0)^T$. Notice that the optimum solution
of ARD, u', is obtained at each stage by subtracting the elements
in the $-u'^T A$ row corresponding to v from the vector e. This is just
applying the formula of (1) section 5.5, or (6) section 6.1, to ARP
and ARD, where $c^T = (0^T, e^T)$.
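
The dual update of each stage can be reproduced mechanically. The following sketch is only an illustration: `dual_step` is a hypothetical helper, the constraint data A, b, c are those assumed for the example above (rows (1, 1, 1, 2) and (2, 1, 3, 1), b = (3, 4), c = (1, 2, 1, 6)), and the vectors u' are taken as given from the solution of ARD at each stage rather than computed.

```python
from fractions import Fraction as F

A = [[1, 1, 1, 2],
     [2, 1, 3, 1]]
b = [3, 4]
c = [1, 2, 1, 6]


def dual_step(y, u):
    """One update y* = y' + theta' u' of the primal-dual algorithm.

    Returns (theta', t, y*), where t is the 1-based column index entering J.
    """
    n = len(c)
    yA = [sum(y[i] * A[i][j] for i in range(2)) for j in range(n)]
    uA = [sum(u[i] * A[i][j] for i in range(2)) for j in range(n)]
    # Ratios are only formed for columns with (u'^T A)_j > 0.
    ratios = {j + 1: (c[j] - yA[j]) / uA[j] for j in range(n) if uA[j] > 0}
    if not ratios:
        raise ValueError("u^T A <= 0: the primal problem P is infeasible")
    t = min(ratios, key=ratios.get)
    theta = ratios[t]
    return theta, t, [y[i] + theta * u[i] for i in range(2)]


# (y', u') at the three stages in which y is updated.
stages = [([F(0), F(0)], [F(1), F(1)]),
          ([F(1, 4), F(1, 4)], [F(1), F(-1, 3)]),
          ([F(1), F(0)], [F(1), F(-1, 2)])]
results = [dual_step(y, u) for y, u in stages]
```

Under these assumptions the three steps give θ' = 1/4, 3/4 and 2 with entering columns 3, 1 and 2, and the successive dual vectors (1/4, 1/4), (1, 0) and (3, -1) listed beneath the tableaux.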


Exercises 5
1. Establish theorem 6 by starting with the canonical primal and 
converting it to standard primal form. 
2. Establish theorem 7 using standard form instead of canonical 
form. 
3. Obtain the dual of the L.p.p. 
maximise ¢’x subject to Ax = b. 
4. For the L.p.p. 


minimise x, — 3x, + 2x, subject to x,, X,, ..., X5, =O 


ee ee | O22. @ il 
and |0 2 4 L 0 Ox= 12 
| ee | eee a | 10 


verify all the features of proof 1 of the duality theorem.

Simplex tableaux for this problem are as follows:


0 
 SSy eee ee 
o2@ +t 8 81/121) 42 
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en Bee ee eee eee 

, Oe ie) ae BS a 
oe, oe ae OR ae 


5. For the l.p.p.s of exercises 3.1(i) and 3.1(ii), write down the solution
of the dual problem and use the duality theorem to verify that
the primal solution is optimum.

6. For the l.p.p. in 4 above, write down the matrix operations of
each stage, and hence write the solution of the dual in elementary
product form (see section 3.7 and equation (2) of section 5.4).

7. Prove that primal infeasible does not imply dual has unbounded
feasible solutions.
Hint: construct a counter example with, for example, m = n = 2
and primal constraints inconsistent.


8. State and prove, for the primal and dual l.p.p.s in canonical form,
a theorem corresponding to the equilibrium theorem.

9. For the l.p.p. (in standard dual form)


maximise 2x, + 4x,+x,+x, subjectto x,,x,,x,,x,20 


E30 @ = 4 
and (3 1 0 0) x= (3) 
Ook oe. ol 3 


use the equilibrium theorem to verify that x = (1, 1, +, 0)’ is the 
optimum solution. 


10. In the case of a degenerate optimum solution of a l.p.p. in standard
primal form, are dual constraints corresponding to basic variables
with value zero active or not?

11. In the light of the duality theorem discuss the salesman's objective
and whether he should modify it in order to sell his products.
Discuss also the "package-deal" aspect of the dietician/salesman
and the manufacturer/rival situations.


12. Solve the l.p.p. below by the primal-dual algorithm, and note
that the two-part simplex method would involve more work.
Minimise 2x, + 4x, + 3x, + 3x, subject to 

$4} f)e=(4) rao 
13. The description in section 5.8 of the primal-dual algorithm shows
that at each stage a column index t is chosen to enter the set
J. Does this imply that the algorithm always finds the optimum
solution in at most n stages (assuming nondegeneracy)?


CHAPTER 6 


DUALITY CONTINUED: A MATRIX VIEW OF THE 
DUALITY THEOREM; THEOREMS OF 
ALTERNATIVES 


6.1 Second Proof of the Duality Theorem 

This is really a compact version of the first proof, and takes an 
overview of the whole simplex process. 

We again assume that the l.p.p. is in canonical form with $A \supset I_m$,
that the problem is solved by the simplex method and that at optimality
$x_1, x_2, \ldots, x_m$ are the basic variables corresponding to $I_m$ in
the first m columns of the final tableau.

As we have done before, we partition A, x and c into an $m \times m$
and an $m \times (n-m)$ matrix and m- and (n-m)-vectors respectively,
and we denote the first m columns of A by B. Thus the l.p.p. is

minimise $c^T x$ subject to $Ax = b$, $x \ge 0$,
which becomes
minimise $c_1^T x_1 + c_2^T x_2$ subject to
$$Bx_1 + A_2 x_2 = (B, A_2)x = b, \quad \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} \ge 0. \qquad (1)$$

Since the overall effect of the simplex stages is to reduce B to
$I_m$, the constraint equations as given at optimality must be
$$x_1 + B^{-1}A_2 x_2 = B^{-1}b,$$
and the optimum solution is given by
$$x_1 = b' = B^{-1}b, \quad x_2 = 0. \qquad (2)$$
The crucial step is to identify the vector $c'_{opt}$ of e.c.c.s at optimality.
This is obtained by subtracting from $c^T$ multiples of rows of A, which
is equivalent to subtracting from $c^T$ multiples of rows of $B^{-1}A$, the coefficient
matrix in the optimal tableau. The defining property of $c'_{opt}$ is that
$(c'_{opt})_i = 0$, $i = 1, 2, \ldots, m$, so that
$${c'_{opt}}^{T} = c^T - \sum_{i=1}^{m} c_i \times (i\text{-th row of } B^{-1}A). \qquad (3)$$
Thus
$${c'_{opt}}^{T} = c^T - c_1^T (B^{-1}A), \qquad {c'_{opt}}^{T} = c^T - (c_1^T B^{-1})A \ge 0^T, \qquad (4)$$
as $c' \ge 0$ is the optimality criterion of the simplex method.
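Alternatively, we can remember that the e.c.c.s express the objective
function in terms of non-basic variables, so eliminating $x_1$ from
$f = c^T x$ gives
$$\begin{aligned} f(x) = c^T x &= c_1^T x_1 + c_2^T x_2 \\ &= c_1^T (B^{-1}b - B^{-1}A_2 x_2) + c_2^T x_2 \qquad (5) \\ &= c_1^T B^{-1}b + 0^T x_1 + (c_2^T - c_1^T B^{-1}A_2)x_2, \end{aligned}$$
so that
$$\begin{aligned} {c'_{opt}}^{T} &= (0^T,\; c_2^T - c_1^T B^{-1}A_2) \\ &= (c_1^T - c_1^T,\; c_2^T - c_1^T B^{-1}A_2) \\ &= (c_1^T - c_1^T B^{-1}B,\; c_2^T - c_1^T B^{-1}A_2) \\ &= c^T - c_1^T B^{-1}A \end{aligned}$$
again, and at optimality $f(x) = c_1^T B^{-1}b$, because $x_2 = 0$.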


As in the first proof we now consider the vector $y_0^T = c_1^T B^{-1}$ (notice
that the vector $\hat{c}^T$ in the first proof, which is defined there at every
stage of the simplex method, is here just
$c_1^T = (c_1, c_2, \ldots, c_m)$ at the optimum stage).
Now $y_0$ is feasible for the dual problem because
$$y_0^T A = c_1^T B^{-1}A \le c^T,$$
and $g(y_0) = y_0^T b = c_1^T B^{-1}b = f(x_{opt})$, and thus, using the argument
from (5) of section 5.4, we establish the duality theorem.
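It is worthwhile recalling the observation near the beginning of
section 5.5. The solution of the dual is in fact given by
$$y_0^T = \tilde{c}^T - {\tilde{c}'_{opt}}^{T}, \quad \text{where } (\tilde{c})_i = c_{j_i}, \quad (\tilde{c}'_{opt})_i = (c'_{opt})_{j_i},$$
and $j_i$ is the column index in A of the i-th column of $I_m$.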

We also recall the n-vector $w^T$ of section 5.4 and observe that
at any stage of the simplex method, using B to denote the $m \times m$
matrix of columns of A corresponding to basic variables,
$$w^T = \hat{c}^T A' = \hat{c}^T (B^{-1}A) = (\hat{c}^T B^{-1})A = c^T - {c'}^{T},$$
so that ${c'}^{T} = c^T - (\hat{c}^T B^{-1})A$.
Thus, with a suitable interpretation, the equations of (3) and (4) hold
throughout the simplex method and the vector $\hat{c}^T B^{-1}$ gives the multiples
of the rows of A that have been subtracted from $c^T$ to obtain the
e.c.c.s ${c'}^{T}$.

A comparison of this section and section 5.4 demonstrates very
clearly the simplicity and clarity that result from describing the simplex
method explicitly as a sequence of matrix operations.
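
The matrix relations of this section can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the small problem (A, b, c), the choice of optimal basis and the function name `dual_from_basis` are illustrative assumptions and are not taken from the text. It forms $y_0^T = c_1^T B^{-1}$ by solving $B^T y_0 = c_1$, and then checks dual feasibility and the equality of the two objective values.

```python
import numpy as np

def dual_from_basis(A, b, c, basis):
    """Return (x, y, ecc) for the basis given by the column indices `basis`.

    x   : basic solution with x_B = B^{-1} b
    y   : dual vector with y^T = c_B^T B^{-1}
    ecc : equivalent cost coefficients c^T - y^T A
    """
    B = A[:, basis]
    x = np.zeros(A.shape[1])
    x[basis] = np.linalg.solve(B, b)
    y = np.linalg.solve(B.T, c[basis])  # B^T y = c_B  <=>  y^T = c_B^T B^{-1}
    ecc = c - y @ A
    return x, y, ecc

# Hypothetical example (not from the text): minimise -x1 - 2x2 subject to
# x1 + x2 + x3 = 4, x1 + 3x2 + x4 = 6, x >= 0.
A = np.array([[1.0, 1.0, 1.0, 0.0],
              [1.0, 3.0, 0.0, 1.0]])
b = np.array([4.0, 6.0])
c = np.array([-1.0, -2.0, 0.0, 0.0])

x, y, ecc = dual_from_basis(A, b, c, basis=[0, 1])
print("x =", x)
print("y =", y)
print("dual feasible:", bool(np.all(ecc >= -1e-12)))
print("f(x) =", c @ x, " g(y) =", y @ b)
```

Solving with $B^T$ rather than forming $B^{-1}$ explicitly is a design choice of the sketch; the text's formulae are unchanged by it.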
6.2
It should be emphasised that the dual solution $y_0$ in the previous
section refers to the primal problem as defined at the end of the
first part of the two-part simplex method. In the first part, using
the artificial objective function $z_1 + z_2 + \ldots + z_m$ (see section 4.3), the
augmented matrix of coefficients of the constraint equations, $(\bar{A}, \bar{b})$
say, which does not contain $I_m$, is converted to (A, b) say, which
does contain $I_m$. The conversion is effected by simplex stages each
of which is equivalent to pre-multiplication by an elementary matrix
$E^*$ (see section 3.7), and the overall effect is that of pre-multiplication
by the inverse of an $m \times m$ matrix Q say, which is the m columns
of $\bar{A}$ which become $I_m$ in A. Thus
$$E^*_k \cdots E^*_2 E^*_1 (\bar{A}, \bar{b}) = (A, b),$$
$$Q^{-1}(\bar{A}, \bar{b}) = (A, b).$$
The matrix $Q^{-1}$ will appear at the end of the first part in the m
columns of the tableau corresponding to the artificial variables, and
at that stage the first m columns are the matrix we have called B.
Consider again the example from section 4.4. In our present notation 


i ittel 02), 


1 
ak i SS ae ON ae . a 
(A,b) = « aa a at B= ( _ 
2 2 2 2 2 


(because x, and x, are basic variables in the optimum tableau), and 


_fi -l wan eee ey ee oe 
o-(; 2 ha )-(0 3 es eek 
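Denoting by $\bar{y}_0$ and $y_0$ the dual solutions to the primal problem in
the forms

minimise $c^T x$ subject to $\bar{A}x = \bar{b}$, $x \ge 0$ and
minimise $c^T x$ subject to $Ax = b$, $x \ge 0$ respectively, (6)

it is easy to relate $\bar{y}_0$ and $y_0$. Because if $y_0$ satisfies
$$y_0^T A \le c^T, \quad y_0^T b = g(y_0) = f(x_{opt}) = c_1^T B^{-1}b \qquad (7)$$
and $\bar{y}_0$ satisfies
$$\bar{y}_0^T \bar{A} \le c^T, \quad \bar{y}_0^T \bar{b} = g(\bar{y}_0) = f(x_{opt}),$$
then substituting for $\bar{A}$ and $\bar{b}$ gives
$$\bar{y}_0^T (QA) \le c^T \quad \text{and} \quad \bar{y}_0^T Qb = c_1^T B^{-1}b.$$
From (7) it is clear that $\bar{y}_0^T Q = y_0^T = c_1^T B^{-1}$, i.e.
$$\bar{y}_0^T = y_0^T Q^{-1}. \qquad (8)$$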




If the columns of the simplex tableau corresponding to the artificial
variables in the first part were not discarded
then they too would be subjected to pre-multiplication by $B^{-1}$ during
the second part and would provide $\bar{y}_0$ in the same way that the final
tableau provides $y_0$, because
$$\bar{y}_0^T = (c_1^T B^{-1})Q^{-1} = c_1^T (B^{-1}Q^{-1}).$$
In section 5.5 the solution $y_0$ of the dual problem as defined by the
primal at the end of the first part of the example quoted above
was obtained. Hence the dual solution to the l.p.p. originally given
in section 4.4 is
$$y_0^T Q^{-1} = (2, 3) = \bar{y}_0^T,$$
and we verify that $\bar{y}_0$ satisfies the dual constraints,
$$(2, 3)\bar{A} = (1, 4, 1, 2, -3) \le (1, 5, 2, 2, 7),$$
and has the same value as the primal optimum,
$$(2, 3)\bar{b} = f(x_{opt}).$$


Notice that the above analysis holds for any non-singular matrix Q, not just the
one identified by the first part of the two-part simplex method. Thus
if a primal l.p.p. with constraint equations Ax = b is converted to
an equivalent l.p.p. by row operations on (A, b) defined by Q, then
the new dual solution is the previous one multiplied by $Q^{-1}$.
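
The effect of row operations on the dual solution can be illustrated numerically. The following is a minimal sketch, assuming NumPy; the data, the matrix Q and the helper name `dual_solution` are illustrative assumptions, not taken from the text. It applies a non-singular Q to the rows of (A, b) and confirms that the new dual solution equals the old one post-multiplied by $Q^{-1}$, with the optimum value unchanged.

```python
import numpy as np

def dual_solution(A, c, basis):
    """Dual vector y with y^T = c_B^T B^{-1} for the given basis columns."""
    return np.linalg.solve(A[:, basis].T, c[basis])

# Hypothetical data (not from the text); columns 0 and 1 form an optimal basis.
A = np.array([[1.0, 1.0, 1.0, 0.0],
              [1.0, 3.0, 0.0, 1.0]])
b = np.array([4.0, 6.0])
c = np.array([-1.0, -2.0, 0.0, 0.0])
basis = [0, 1]

Q = np.array([[2.0, 1.0],
              [1.0, 1.0]])           # any non-singular m x m matrix
A_new, b_new = Q @ A, Q @ b          # row operations on (A, b) defined by Q

y_old = dual_solution(A, c, basis)
y_new = dual_solution(A_new, c, basis)

# y_new^T = y_old^T Q^{-1} is the same statement as Q^T y_new = y_old.
print(y_new, np.linalg.solve(Q.T, y_old))
print(y_new @ b_new, y_old @ b)      # the optimum value is unchanged
```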


6.3 Theorems of Alternatives for Matrices 

A number of interesting results which state that one of a pair of 
mutually exclusive possibilities concerning a general matrix A must 
be true, can be regarded as immediate consequences of the duality 
theorem. 

In fact these theorems were established about fifty years before
the duality theorem and part of one of them, the theorem of the
separating hyperplane, is often used to provide a proof of the duality
theorem which is independent of the simplex method. This is the
approach used in the third proof of the duality theorem; see Appendix 2.

Theorem 10. Farkas' Lemma, or the Theorem of the Separating
Hyperplane
For any $m \times n$ matrix A and any m-vector b
either (i) there is an n-vector x such that $x \ge 0$ and $Ax = b$,
or (ii) there is an m-vector y such that $y^T b < 0$ and $y^T A \ge 0^T$.

Strictly, "(i) false implies (ii) true" is Farkas' Lemma, i.e. if there
is no non-negative vector x such that Ax = b, then there is a vector
y such that $y^T b < 0$ and $y^T A \ge 0^T$. This is also called the theorem
of the separating hyperplane because of the following interpretation:
let $a_1, a_2, \ldots, a_n$ be the m-vector columns of A and let G be the
set of all non-negative linear combinations of $a_1, a_2, \ldots, a_n$. Thus
$z \in G$ if $z = Ax$ for $x \ge 0$. The set G is convex (ER) and is in
fact a convex cone. (A set G is a cone if $z \in G$ implies $\alpha z \in G$
for all $\alpha \ge 0$.) The vertex of the cone is the origin and corresponds
to x = 0.

[Figure: separating hyperplane]

The theorem says that any point b either belongs to G or can
be separated from G by a hyperplane. The hyperplane is defined
by y and consists of all points z such that $y^T z = 0$, and it separates
b and G because the projection of b onto y, $y^T b$, is negative whereas
the projection of any point of G onto y is non-negative, since $y^T A \ge 0^T$, i.e.
$y^T a_j \ge 0$, $j = 1, 2, \ldots, n$.

The theorem may be proved as follows: 
1. (i) true implies (ii) false.
Let $x \ge 0$ satisfy $Ax = b$ and consider any y such that $y^T A \ge 0^T$.
Then $(y^T A)x \ge 0$ and therefore $y^T b \ge 0$, because $Ax = b$, and
therefore (ii) is false.

2. (ii) false implies (i) true.
Note that it is not sufficient now to prove (ii) true implies (i)
false because this leaves the possibility that both (i) and (ii) are
false.
If (ii) is false then there is not a vector y such that
$$y^T(-A) \le 0^T \quad \text{and} \quad y^T(-b) > 0. \qquad (1)$$
The canonical dual l.p.p. is
maximise $y^T b$ subject to $y^T A \le c^T$
and in this case we have $-A$ and $-b$ in place of A and b, and c = 0. Now y = 0 satisfies the dual constraints
and gives the dual objective the value 0, and by the assertion (1) y = 0 is the optimum
solution. By the duality theorem the canonical primal l.p.p. has an
optimum solution (with value 0, which agrees with c = 0) and thus
the canonical primal is feasible, i.e. there is a vector x such that
$x \ge 0$ and $(-A)x = -b$, that is $Ax = b$. Thus (i) is true.

Similar results, including two theorems of alternatives for matrices, 
are mentioned in exercises 6.5, 6.6 and 6.7. 
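
For a square non-singular A the two alternatives of theorem 10 can be exhibited directly, which may help in picturing the general case. The following is a minimal sketch under that restrictive assumption (NumPy is assumed; the function name `farkas_square` and the data are illustrative, not from the text). If $x = A^{-1}b \ge 0$ then (i) holds; otherwise the row of $A^{-1}$ corresponding to a negative element of x serves as $y^T$ in (ii), because then $y^T A$ is a unit vector and $y^T b$ is that negative element.

```python
import numpy as np

def farkas_square(A, b, tol=1e-12):
    """Decide between alternatives (i) and (ii) for a square non-singular A.

    Returns ("i", x) with x >= 0 and Ax = b, or
            ("ii", y) with y^T b < 0 and y^T A >= 0^T.
    """
    A_inv = np.linalg.inv(A)
    x = A_inv @ b
    if np.all(x >= -tol):
        return "i", x
    k = int(np.argmin(x))       # index of the most negative element of x
    y = A_inv[k, :]             # y^T = e_k^T A^{-1}
    return "ii", y

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])      # hypothetical data

print(farkas_square(A, np.array([3.0, 3.0])))   # b inside the cone G
kind, y = farkas_square(A, np.array([3.0, 0.0]))
print(kind, y, y @ np.array([3.0, 0.0]), y @ A)  # b outside G: certificate y
```

For a general $m \times n$ matrix no such shortcut is available, and a vector y would normally be obtained by solving a linear programme, as in the proof above.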
Exercises 6
1. Obtain the solution of the dual problems of the l.p.p.s of exercises
4.1, 4.3 (ii).
2. The l.p.p.
minimise’ 2x, 3x, — %, (subject to. °x,, x), X3) ¥=0 
and 


a SF ee 2 
1 —2 4 O}fx={2 
aad. Set game 4 


is solved below. Obtain and check the optimum solution of the
dual l.p.p.

3. For the case $A = \ldots$ and $b = \ldots$ the assertion (ii)
of theorem 10 holds. Draw a diagram of the situation in the $(y_1, y_2)$
plane and by inspection find the equation of a separating hyperplane
(a line in this case). Explain how the equation of a separating
hyperplane could be found in general.

4. Suppose the l.p.p. in canonical primal form

minimise $c^T x$ subject to $Ax = b$, $x \ge 0$,

where A is $m \times n$, has solution $x_0$ and suppose that the dual problem
has solution $y_0$. The j-th equality constraint of the primal problem
is now replaced by the (j-th constraint) + ($\lambda$ × the i-th constraint).
What is the solution of the new dual problem?


5. One theorem of alternatives for matrices states that for any
$m \times n$ matrix A and any m-vector b
either (i) there is an n-vector x such that $Ax \le b$ and $x \ge 0$,
or (ii) there is an m-vector y such that $y^T A \ge 0^T$, $y \ge 0$
and $y^T b < 0$.
Prove this theorem using the duality theorem. (Hint: connect
the relationships in (i) and (ii) with the standard dual and standard
primal l.p.p.s respectively.)

6. Use the duality theorem to establish the following theorem: Gordon's
Theorem: For any $m \times n$ matrix A
either (i) there is an n-vector x such that $Ax = 0$, $x \ge 0$, $x \ne 0$,
or (ii) there is an m-vector y such that $y^T A > 0^T$.
(Hint: (i) is equivalent to:
there is an n-vector x such that $Ax = 0$, $x \ge 0$, $e^T x = 1$, where
$e = (1, 1, \ldots, 1)^T$.)

7. Another theorem of alternatives for matrices: prove that for any
$m \times n$ matrix A and any m-vector b,
either there is an n-vector x such that $Ax = b$,
or there is an m-vector y such that $y^T A = 0^T$ and $y^T b \ne 0$.

8. Establish a stronger version of the equilibrium theorem for canonical
form (see exercise 5.8), namely that dual constraints corresponding
to primal variables basic at optimality are satisfied as equalities
by the optimum solution of the dual.




9. Show that the relationship $c^T x = \hat{c}^T B^{-1}b$ (see section 6.1) holds
not just at the optimum stage, but at any stage of the simplex
method, with $\hat{c}$ as defined in section 5.4, and B the $m \times m$ matrix
consisting of the columns of A corresponding to the basic variables
of x, the current b.f.s.


CHAPTER 7 


THE REVISED SIMPLEX METHOD 


7.1

As we have described it in section 3.2 the simplex method involves
replacing a current system of equality constraints $A'x = b'$ by an
equivalent system $A^*x = b^*$ at each stage and finding a corresponding
new set of e.c.c.s. Nowadays this is very rarely what is done in
practice. Two considerations motivate us to review our interpretation
of the method and hence to produce a revised scheme for organising
the calculations at each stage. The first is that we are really only
identifying and solving an $m \times m$ system of equations. In doing so
some (possibly most) of the variables will remain non-basic throughout
the process, so that the arithmetic operations performed on the elements
of the corresponding columns are, in a sense, unnecessary. The second
is that when solving an $m \times m$ system of equations on a computer
we know that the arithmetic operations are not performed exactly
and that special methods should be used to minimise the effects of
the arithmetic errors. One recommended method, called Gaussian
elimination with interchanges, is described in Appendix 3 together
with implications for the (revised) simplex method. In this chapter
we concentrate on the first aspect and, since both parts of the two-part
simplex method consist of solving by the simplex method a l.p.p.
in which $A \supset I_m$, we can take as our starting point the l.p.p.

minimise $c^T x$ subject to $Ax = b$, $x \ge 0$,
where $A \supset I_m$.

As we saw in section 3.7 each stage of the simplex method consists
of premultiplying the current system $(A', b')$ by $E^*$ to get $(A^*, b^*)$.
Thus, at the end of the (k-1)-th stage we have
$$(A', b') = E^*_{k-1} E^*_{k-2} \cdots E^*_2 E^*_1 (A, b), \qquad (1)$$
and e.c.c.s
$${c'}^{T} = c^T - \hat{c}^T (E^*_{k-1} \cdots E^*_2 E^*_1)A, \qquad (2)$$
where $(\hat{c})_i = c_{j_i}$, and the $j_i$-th column of $A'$ is the i-th column of
$I_m$ (see section 5.4). If we denote by B the $m \times m$ matrix whose
i-th column is the $j_i$-th column of A (consistent with our notation
in sections 5.4 and 6.1), then $E^*_{k-1} \cdots E^*_2 E^*_1 = B^{-1}$, and the three
sets of coefficients which define the l.p.p., currently $A'$, $b'$, $c'$, are
given by
$$A' = B^{-1}A, \quad b' = B^{-1}b \quad \text{and} \quad {c'}^{T} = c^T - \hat{c}^T B^{-1}A. \qquad (3)$$
The next stage can be described as follows:
(i) Find $\min_j c'_j = \min_j (c^T - \hat{c}^T B^{-1}A)_j = c'_t$
(if $c' \ge 0$ we have the optimum solution).
(ii) Find $\min_{i:\, a'_{it} > 0} b'_i / a'_{it} = \min_{i:\, (B^{-1}a_t)_i > 0} (B^{-1}b)_i / (B^{-1}a_t)_i = b'_s / a'_{st}$.
(iii) With s and t defined, evaluate $E^*$.
(iv) Replace $B^{-1}$ by $E^* B^{-1}$.
(v) Replace the s-th element of $\hat{c}$ by $c_t$, and $j_s$ by t.


We now have the situation described by (3) again, with $A^*$, $b^*$,
$c^*$ defined by the same equations but using the new $B^{-1}$ and $\hat{c}$, so
we can repeat steps (i) to (v) until the optimality criterion is
satisfied. Remember that the actual matrix $B^{-1}$ is stored, so we do
not have to calculate the inverse of the matrix B.

This approach, in which we use the original coefficients in A, b
and c together with $\hat{c}$ and the matrix $B^{-1}$ instead of the equivalent
coefficients in $A'$, $b'$ and $c'$, is called the Revised Simplex Method.

The precise implementation in practice has various alternatives,
some of which are discussed in section 7.3.
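
Steps (i) to (v) can be sketched in code as follows. This is a minimal illustration, assuming NumPy and a dense A; the function name `revised_simplex`, the tolerance and the example data are assumptions made for the sketch and are not from the text. It takes no precautions against cycling at degenerate solutions, and it forms $E^* B^{-1}$ as a general matrix product purely for brevity.

```python
import numpy as np

def revised_simplex(A, b, c, basis, max_stages=100, tol=1e-9):
    """Minimise c^T x subject to Ax = b, x >= 0 by the revised simplex method.

    `basis` lists the column indices j_1, ..., j_m of an initial feasible
    basis (for example the columns of A that form I_m).
    Returns the optimum solution x and the final list of basic indices.
    """
    A = np.asarray(A, dtype=float)
    b = np.asarray(b, dtype=float)
    c = np.asarray(c, dtype=float)
    m, n = A.shape
    basis = list(basis)
    B_inv = np.linalg.inv(A[:, basis])      # equals I_m when the basis is I_m

    for _ in range(max_stages):
        c_hat = c[basis]
        d = c_hat @ B_inv                   # d^T = c_hat^T B^{-1}
        ecc = c - d @ A                     # step (i): current e.c.c.s
        t = int(np.argmin(ecc))
        if ecc[t] >= -tol:                  # optimality criterion c' >= 0
            x = np.zeros(n)
            x[basis] = B_inv @ b
            return x, basis

        col = B_inv @ A[:, t]               # B^{-1} a_t
        rhs = B_inv @ b                     # B^{-1} b
        eligible = col > tol
        if not eligible.any():
            raise ValueError("objective is unbounded below")
        ratios = np.full(m, np.inf)
        ratios[eligible] = rhs[eligible] / col[eligible]
        s = int(np.argmin(ratios))          # step (ii): pivot row

        E = np.eye(m)                       # step (iii): elementary matrix
        E[:, s] = -col / col[s]
        E[s, s] = 1.0 / col[s]
        B_inv = E @ B_inv                   # step (iv)
        basis[s] = t                        # step (v)

    raise RuntimeError("stage limit reached")


# Hypothetical example: minimise -x1 - 2x2 subject to
# x1 + x2 + x3 = 4, x1 + 3x2 + x4 = 6, x >= 0, starting from the slack basis.
A = [[1, 1, 1, 0],
     [1, 3, 0, 1]]
x, basis = revised_simplex(A, b=[4, 6], c=[-1, -2, 0, 0], basis=[2, 3])
print(x, basis)
```

Note that only $B^{-1}$, $\hat{c}$ and the index list change from stage to stage; A, b and c are never modified.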


7.2 

It seems to be the custom to rejoice at this point at having found
such an efficient improvement over the simplex tableau approach.
Instead of calculating a completely new tableau we only have to
calculate the new e.c.c.s, the new values of the current basic variables,
$B^{-1}b$, the column vector $B^{-1}a_t$, and the new $B^{-1}$, as in steps (i), (ii) and
(iv) in the previous section. The celebrations are quite misguided, because
the revised simplex method is no faster than the tableau approach
for general matrices A.

In both cases we can take account of the fact that $I_m$ is present
and so regard the tableaux A and $A'$ as $m \times (n-m)$ matrices. What
is usually overlooked is that evaluating $\hat{c}^T B^{-1}A$ involves the same
number of arithmetic operations as forming $A^*$ from $A'$. Essentially,
if we put $\hat{c}^T B^{-1} = d^T$ then evaluating $d^T A$ requires $m(n-m)$ multiplications
and $(m-1)(n-m)$ additions, whereas evaluating $A^*$ (see
section 3.2) requires one division, $m(n-m)$ multiplications
and $(m-1)(n-m)$ additions.

Basically both approaches require m(n—m) additions and multi- 
plications for the main part of the calculations at each stage. However, 
the revised simplex method, in one or other of its implementations, 
is now the standard method for solving /.p.p.s in practice, so we 
need to see why. 

The crucial point is that the revised simplex method always involves
the original matrix A, instead of a sequence of changing matrices
$A'$. For most large problems in practice A is sparse, that is most
of the elements $a_{ij}$ are zero (less than 20% non-zero is not uncommon),
and of course all the arithmetic operations involving addition of zero
or multiplication by zero can be omitted. Remember that in theory
replacing $a_{ij}$ by $a_{ij} + 0$, or multiplying 0 by $a_{ij}$, is the same as omitting
the operation, whereas in practice many (perhaps most) computers
take as long to add zero or multiply by zero as they do to add or
multiply by any other number.

The tableau operations tend to fill in the zero elements so that
$A'$ becomes less and less sparse, and so even if we were to avoid
actually performing operations with zeros there are more operations
to perform at each stage. On the other hand, the revised simplex
operations are to evaluate $d^T = \hat{c}^T B^{-1}$, where neither of $\hat{c}^T$ and $B^{-1}$
is to be regarded as sparse (although $B^{-1}$ certainly is initially), and
then to evaluate $d^T A$. When A is sparse, many empty operations
(involving zeros) can be omitted, and these are exactly the same operations
at every stage.

This then is the reason for the greater efficiency of the revised
simplex method, but the savings are non-existent if A is not significantly
sparse, and are not realised if we do not take advantage of the
sparseness. This suggests, correctly, that a good revised simplex
computer program is quite a complicated affair. In practice, a sparse
A is not stored as an $m \times n$ matrix at all; instead only the non-zero
elements together with their row and column indices, $a_{ij}$, i, j, are
stored and the arithmetic operations are organised in terms of this
information.
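
The storage scheme just described can be sketched as follows. This is a minimal illustration, assuming NumPy; the helper names `to_triplets` and `row_vector_times_sparse` and the example matrix are hypothetical. Only the non-zero elements $a_{ij}$ are kept, with their indices i and j, and $d^T A$ is then evaluated with one multiplication per stored element.

```python
import numpy as np

def to_triplets(A):
    """Store only the non-zero elements a_ij together with their indices i, j."""
    rows, cols = np.nonzero(A)
    return [(int(i), int(j), float(A[i, j])) for i, j in zip(rows, cols)]

def row_vector_times_sparse(d, triplets, n):
    """Evaluate d^T A using one multiplication per stored non-zero element."""
    result = np.zeros(n)
    for i, j, a_ij in triplets:
        result[j] += d[i] * a_ij
    return result

# Hypothetical sparse matrix: 5 non-zero elements out of 15.
A = np.array([[1.0, 0.0, 0.0, 2.0, 0.0],
              [0.0, 0.0, 3.0, 0.0, 0.0],
              [0.0, 4.0, 0.0, 0.0, 5.0]])
d = np.array([1.0, -2.0, 0.5])

triplets = to_triplets(A)
print(len(triplets), "stored elements out of", A.size)
print(row_vector_times_sparse(d, triplets, A.shape[1]))
```

Because A itself never changes in the revised simplex method, the list of triplets is built once and reused at every stage.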
7.3

There are three distinct ways of implementing the revised simplex
method, although in practice different aspects of each can be combined.
In the brief discussion of each that follows, we assume that advantage
will be taken of the sparseness.

(i) The implementation can be explicitly as described in section 7.1,
with $B^{-1}$ stored as an $m \times m$ matrix and "updated" at each stage
by pre-multiplication by $E^*$, and $b' = B^{-1}b$, $d^T = \hat{c}^T B^{-1}$ and $d^T A$
evaluated as vector × matrix products. Note that $E^*$ has a very
simple form and $E^* B^{-1}$ would not be evaluated as a general matrix
product but rather as a sequence of row operations on $B^{-1}$. Also,
we would probably have an m-vector j, whose elements $j_1, j_2, \ldots, j_m$
are the column indices of the columns of $I_m$, so that
$$d_k = (\hat{c}^T B^{-1})_k, \quad k = 1, 2, \ldots, m,$$
is given by $\sum_{i=1}^{m} c_{j_i} (B^{-1})_{ik}$.

(ii) Instead of storing $B^{-1}$ explicitly as an $m \times m$ matrix, we can
store it implicitly in product form, because $E^*$ is obtained at
each stage and
$$B^{-1} = E^*_k E^*_{k-1} \cdots E^*_2 E^*_1.$$
Then an expression such as $B^{-1}b$ can be evaluated by evaluating
successively $E^*_1 b$, $E^*_2 (E^*_1 b)$, $E^*_3 (E^*_2 (E^*_1 b))$ etc. Remember that
each of these products can be evaluated efficiently by taking
into account the special form of $E^*$, and that to store $E^*$ we
only need to store the single non-trivial column of $E^*$ together
with the corresponding integer column index. Thus $E^*_1, E^*_2, \ldots, E^*_m$
can be stored in the same amount of space that $B^{-1}$ requires.
More stages would require extra storage space, or a compromise
between the two approaches. The advantage of this approach
is that each column vector representing an $E^*$ will be sparse
if A is sparse and this can be used to save storage and to reduce
the number of operations performed.

(iii) The third approach brings us back full circle to the observation
made near the end of section 2.9, that solving a l.p.p. really
only requires solving an $m \times m$ system of linear equations.
The three equations involving $B^{-1}$ at each stage are
$$d^T = \hat{c}^T B^{-1}, \quad b' = B^{-1}b \quad \text{and} \quad a'_t = B^{-1}a_t. \qquad (1)$$
These can be regarded as three $m \times m$ systems of linear equations
for the unknown vectors d, $b'$ and $a'_t$, which all involve the same
matrix of coefficients B. The matrix B consists of the m columns
of A identified by the integer elements of the vector j of the
previous section. Thus instead of storing $B^{-1}$, updating it at each
stage and explicitly forming the products (1), we can solve the
three systems of equations
$$Ba'_t = a_t, \quad Bb' = b \quad \text{and} \quad B^T d = \hat{c}. \qquad (2)$$
This would appear to be very inefficient, but the three systems can
be solved with little more effort than is needed to solve one of them
(see Appendix 3), and it is possible to update information from the
previous stage to avoid most of the calculations required (see {12},
chapter 1.2 of {8} and also exercise 7.3). The important aspect of
this approach is that a method of solving the equations that is known
to produce satisfactorily accurate solutions can be used. In addition,
one can periodically revert to the equations (2) and solve them without
reference to earlier stages, to prevent successive arithmetic errors
building up.
This approach ensures that the simplex method is numerically stable,
and should be the standard approach in practice.
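
The product form of implementation (ii) can be sketched as follows. This is a minimal illustration, assuming NumPy; the helper names `eta_column` and `apply_product_form`, the pivot choices and the data are hypothetical. Each elementary matrix is stored only as its pivot row index and its single non-trivial column, and a product such as $B^{-1}b$ is evaluated by applying the stored columns in turn without forming any matrix.

```python
import numpy as np

def eta_column(col, s):
    """Non-trivial column of an elementary matrix for pivot row s,
    where col = B^{-1} a_t is the current entering column."""
    col = np.asarray(col, dtype=float)
    eta = -col / col[s]
    eta[s] = 1.0 / col[s]
    return s, eta

def apply_product_form(eta_file, v):
    """Evaluate E_k ... E_2 E_1 v, applying E_1 first."""
    v = np.array(v, dtype=float)
    for s, eta in eta_file:
        v_s = v[s]
        v = v + eta * v_s      # rows i != s: v_i + eta_i * v_s
        v[s] = eta[s] * v_s    # row s:       eta_s * v_s
    return v

# Hypothetical data: column index 1 enters in row index 1, then column
# index 0 enters in row index 0 (indices are 0-based in this sketch).
A = np.array([[1.0, 1.0, 1.0, 0.0],
              [1.0, 3.0, 0.0, 1.0]])
b = np.array([4.0, 6.0])

eta_file = []
eta_file.append(eta_column(apply_product_form(eta_file, A[:, 1]), s=1))
eta_file.append(eta_column(apply_product_form(eta_file, A[:, 0]), s=0))

print(apply_product_form(eta_file, b))     # B^{-1} b via the product form
print(np.linalg.solve(A[:, [0, 1]], b))    # direct solution for comparison
```

The direct solve in the last line corresponds to implementation (iii), here used only as a check on the product form.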
Exercises 7

1. Assuming that A is not sparse, evaluate precisely the number of
arithmetic operations (additions, multiplications and divisions)
needed to perform one stage of the simplex method
a) using the tableau,
b) using the 7.3(i) implementation of the revised simplex method.

2. Re-solve, using implementations 7.3(i) and 7.3(ii) of the revised
simplex method, a l.p.p. previously solved using the simplex tableau,
e.g. the problem in section 3.4, the problem in section 4.4, exercise
4.1.

3. Suppose an $n \times n$ matrix $B_1$ is obtained by replacing the s-th column
of an $n \times n$ matrix B by an n-vector a. Explain how $B_1^{-1}$ may
be obtained efficiently if $B^{-1}$ is available. (See step (iv), section 7.1
and section 7.3.)

CHAPTER 8


PARAMETRIC LINEAR PROGRAMMING AND 
SENSITIVITY ANALYSIS 


8.1 

If a l.p.p. is solved, and then a small change is made, such as
one coefficient $a_{ij}$, $b_i$, or $c_j$ changed, or one constraint removed,
one would hope that the solution of the new problem could be obtained
without having to start all over again. For certain changes this is
the case and so the effects of, say, a changing price or a change
in resources can be determined efficiently. Doing so effectively
determines the sensitivity of the solution to the particular coefficient
or constraint involved. An introduction to this aspect of linear
programming is given in sections 8.3 to 8.6, by considering four
particular changes. This section and the following section are concerned
with a similar aspect in which the objective function f(x) depends
linearly on a parameter $\lambda$, and we require the optimum solution as
a function of $\lambda$. This is usually called parametric linear programming,
although this term would also be appropriate if a parameter were
present in A or b.

The vector of cost coefficients may be denoted by $c(\lambda) = c + \lambda d$,
so that the objective function $f = f(x, \lambda) = (c + \lambda d)^T x = c^T x + \lambda d^T x$.

Suppose that the interval $\lambda_0 \le \lambda \le \lambda_M$ is of interest, and as usual
assume $A \supset I_m$. The l.p.p. with $\lambda = \lambda_0$ may be solved in the normal
way, but in terms of the tableau approach, we can replace the c-row
of e.c.c.s by two rows, a c-row and a d-row, which are initially
$c^T$ and $d^T$ respectively. The e.c.c.s at any stage are given by
${c'}^{T}(\lambda_0) = {c'}^{T} + \lambda_0 {d'}^{T}$, where $c'$ and $d'$ at every stage are each obtained
by an appropriate row operation. This presents no difficulties; the
usual $c'_j$ becomes $c'_j + \lambda_0 d'_j$ and so the optimality criterion is
$$c'_j(\lambda_0) = c'_j + \lambda_0 d'_j \ge 0, \quad j = 1, 2, \ldots, n.$$

There are two possibilities:

(i) for $\lambda = \lambda_0$ an optimum solution is obtained, and
(ii) for $\lambda = \lambda_0$ and some j, at some stage we find
$$c'_j(\lambda_0) < 0 \quad \text{and} \quad a'_{ij} \le 0, \quad i = 1, 2, \ldots, m.$$




If (i) is the case, we would expect that in general $\lambda$ can be moved
from $\lambda_0$ without violating the conditions
$$c'_j(\lambda) \ge 0, \quad j = 1, 2, \ldots, n, \qquad (1)$$
i.e. for some range of $\lambda$, $\lambda_- \le \lambda \le \lambda_+$, $c'_j(\lambda) \ge 0$, where $\lambda_- \le \lambda_0 \le \lambda_+$.
For $d'_j < 0$ we know that $\lambda_0 \le -c'_j/d'_j$, and we require $\lambda \le -c'_j/d'_j$.
For $d'_j > 0$ we know that $\lambda_0 \ge -c'_j/d'_j$, and we require $\lambda \ge -c'_j/d'_j$.

Thus the inequalities (1) are satisfied by $\lambda_- \le \lambda \le \lambda_+$, where
$$\lambda_- = \max_{j:\, d'_j > 0} \left(-c'_j/d'_j\right), \qquad \lambda_+ = \min_{j:\, d'_j < 0} \left(-c'_j/d'_j\right),$$
and $\lambda_-$, $\lambda_+$ are $-\infty$, $+\infty$ if all $d'_j$ are $\le 0$, $\ge 0$ respectively.

The current optimum solution $x_0$ remains the optimum solution for
$\lambda_- \le \lambda \le \lambda_+$; $\lambda_- \le \lambda_0$ and the value of $\lambda_-$ may or may not be of
interest. If $\lambda_+ = +\infty$ or $\lambda_+ \ge \lambda_M$, the parametric l.p.p. is solved,
and there is a single optimum solution for the whole of the range
of interest of $\lambda$. Note that although the optimum solution $x_0$ does
not change, the value of $f(x_0, \lambda)$ varies linearly with $\lambda$.

If, however, $\lambda_+$ is finite and $\lambda_+ < \lambda_M$, then we must have $\lambda_+ = -c'_t/d'_t$
for some t, $1 \le t \le n$, and so for $\lambda > \lambda_+$, $c'_t(\lambda) < 0$, and if $a'_{it} \le 0$,
$i = 1, 2, \ldots, m$, the l.p.p. with $\lambda > \lambda_+$ has feasible solutions whose
values are unbounded below. Otherwise $a'_{it} > 0$ for some i, $1 \le i \le m$,
and if we perform the pivotal operations of the simplex method with
$a'_{st}$ as pivot we obtain an optimum simplex tableau (for $\lambda = \lambda_+$)
in which $x_t$ is a basic variable. This returns us to the beginning of
case (i) with $\lambda_+$ instead of $\lambda_0$, so we put $\lambda_1 = \lambda_+$ say and repeat
the procedure. The next time we find $\lambda_+ \ge \lambda_M$, or feasible solutions
with values unbounded below or we find a new $\lambda_+$, say $\lambda_2$. Thus
we generate a sequence of characteristic values $\{\lambda_k\}$, $\lambda_0 \le \lambda_1 \le \lambda_2 \le \ldots \le \lambda_K$,
$\lambda_K \le \lambda_M$. It may happen that $\lambda_k = \lambda_{k+1}$, but it can be shown that
the set of basic variables (and generally the optimum solutions)
corresponding to $\lambda_k$ and $\lambda_{k+1}$ are different, and cannot occur again.

If (ii) is the case, either for $\lambda = \lambda_0$ or for any $\lambda = \lambda_k$, then the
l.p.p. for this value of $\lambda$ (we shall use $\lambda = \lambda_0$ for convenience)
has feasible solutions but no optimum solutions. Suppose that
$c'_j(\lambda_0) < 0$, i.e. $c'_j + \lambda_0 d'_j < 0$, then if $d'_j \le 0$, $c'_j(\lambda) < 0$ for any $\lambda \ge \lambda_0$,
and with $a'_{ij} \le 0$, $i = 1, 2, \ldots, m$, we conclude that there is no optimum
solution for any $\lambda \ge \lambda_0$. On the other hand, if $d'_j > 0$, then $c'_j(\lambda) < 0$
for $\lambda_0 \le \lambda < -c'_j/d'_j$.

So consider $\lambda = \lambda' = -c'_j/d'_j$. If $\lambda' \ge \lambda_M$, there are feasible solutions
with unbounded values for $\lambda_0 \le \lambda \le \lambda_M$. Assuming that $\lambda' < \lambda_M$ we
essentially just return to the simplex method. If $c'_j(\lambda') \ge 0$ for all $j = 1, 2, \ldots, n$,
we have an optimum feasible solution, so we have case (i) with $\lambda'$
instead of $\lambda_0$. If $c'_j(\lambda') < 0$ for some j, we continue the simplex
method, and with $\lambda = \lambda'$ we arrive again at one of the two possibilities
(i) and (ii), and we continue until either $\lambda' \ge \lambda_M$ or $\lambda_+ \ge \lambda_M$.

The procedure in practice is simpler than the above analysis suggests,
as the example in section 8.2 demonstrates. We observe that the
situation at a characteristic value is essentially that with which exercise
3.6 is concerned.

We also observe that if a l.p.p. is solved and then c is changed
to $\bar{c}$ say, it is a simple matter to solve the modified problem. We
just replace $c'_{opt}$ by $\bar{c}$, convert to equivalent cost coefficients $\bar{c}'$ by
the operations described in section 4.2, and if $\bar{c}'_j < 0$ for any j we
just continue with the simplex method. If $\bar{c}' \ge 0$ then the optimum
solution is unchanged, but the optimum value changes from $c^T x_{opt}$
to $\bar{c}^T x_{opt}$. We consider one aspect of this in more detail in section
8.6.
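
The range $\lambda_- \le \lambda \le \lambda_+$ of case (i) is easily computed from the c-row and d-row of an optimum tableau. The following is a minimal sketch; the function name `lambda_range` and the two rows used to exercise it are hypothetical and are not taken from any tableau in the text. It also returns the column index t at which $\lambda_+$ is attained, which is the column that enters the basis next.

```python
import math

def lambda_range(c_row, d_row):
    """Return (lambda_minus, lambda_plus, t) for the e.c.c.s c'_j + lambda d'_j.

    lambda_minus = max of -c'_j / d'_j over d'_j > 0 (or -inf if there is none)
    lambda_plus  = min of -c'_j / d'_j over d'_j < 0 (or +inf if there is none)
    t            = 0-based column index attaining lambda_plus, or None
    """
    lam_minus, lam_plus, t = -math.inf, math.inf, None
    for j, (cj, dj) in enumerate(zip(c_row, d_row)):
        if dj > 0:
            lam_minus = max(lam_minus, -cj / dj)
        elif dj < 0 and -cj / dj < lam_plus:
            lam_plus, t = -cj / dj, j
    return lam_minus, lam_plus, t

# Hypothetical rows of an optimum tableau (basic columns have c'_j = d'_j = 0).
c_row = [1, 0, 3, 2]
d_row = [2, 0, -1, -4]
print(lambda_range(c_row, d_row))
```

Columns with $d'_j = 0$ impose no restriction on $\lambda$ and are skipped, in line with the conventions for $-\infty$ and $+\infty$ stated above.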


8.2 Example 
Solve the l.p.p.

minimise $\{(-1, -2, -1, 0, 0, 0) + \lambda(1, 0, -3, 2, 0, -6)\}x$
subject to $x \ge 0$ and


2 D Pty Fed 2 0 2 
2-2) = 0 1 Ols=1 6}, » 
eS i a 6 


for $0 \le \lambda < \infty$.

This is the problem of section 3.4 with c replaced by $c + \lambda d$, where
$d^T = (1, 0, -3, 2, 0, -6)$, and with $\lambda_0 = 0$, $\lambda_M = \infty$.

Solving the l.p.p. with $\lambda = \lambda_0 = 0$ is the same as solving the l.p.p.
of section 3.4 so we can start with the optimum tableau of that section,
just adding a d-row, and converting to the appropriate form in which
$d'_j = 0$ for the three variables that were basic
at optimality.

Thus the first stage in this case is given by


3 0 clt, 1: Obpad wie oda 4446 
Oe ne ee, a. a ee (2) 
oO” OF On 28 4 1 0 
6 ~ O10 ition Fue 440 
l 0 -3 2201Ove Sihed 
‘ 4/19 90 Lowedo 0+ © *, 
4 
Here we have case (i), so
$$\lambda_- = \max\{\ldots\}, \qquad \lambda_+ = \min\{\ldots\} = \tfrac{1}{3} = \lambda_1, \quad t = 5,$$
and hence via the b-column s = 2. Pivoting on $a'_{25}$ leads to the next
tableau.


2hoftipom-pa: ioe OF wOo Pomnig we 
oe ee ei, ep we 
248 oS Wiketle ehalid diy (3) 
Duce® sedov Zomlio of dott 
Ib s@om & ek lo Ooqey apa 
éi 
e'(A,)" s € © £5) Bae 


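Thus $x_{opt} = (0, 2, 0, 0, 8, 4)^T$ when $\lambda = \lambda_1 = \tfrac{1}{3}$, and we have added the
row ${c'}^{T}(\lambda)$ for $\lambda = \lambda_1$ to confirm that ${c'}^{T}(\lambda_1) \ge 0$. For $\lambda = \lambda_1$ the optimum
value is -12. This is again case (i) so
$$\lambda_- = \max\{\ldots, \tfrac{1}{3}\} = \tfrac{1}{3} \quad (= \lambda_1 \text{ of course}),$$
$$\lambda_+ = \min\{\tfrac{1}{2}\} = \tfrac{1}{2} = \lambda_2, \quad t = 4, \text{ and } s = 1.$$
Thus we obtain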


2A A AD 8 eA 
2 eh 3 8 oe eae 
poy wate ae Pear ee (4) 
-l 2 -1° 0 O+ 07] 0 
rs hee. Wine ae Sa es Wo 
e’(A,)” = ©  § 8. Bo tae 
Thus $x_{opt} = (0, 0, 0, 2, 6, 6)^T$ when $\lambda = \lambda_2 = \tfrac{1}{2}$, and we have again
added ${c'}^{T}(\lambda)$ to confirm that ${c'}^{T}(\lambda_2) \ge 0$.

The characteristic values of the parameter $\lambda$ are $\tfrac{1}{3}$ and $\tfrac{1}{2}$ and
for $0 \le \lambda \le \tfrac{1}{3}$ ($\lambda_- \le \lambda \le \tfrac{1}{3}$ in fact), $x_{opt} = (0, 4, 2, 0, 0, 0)^T = x_1$ say,
for $\tfrac{1}{3} \le \lambda \le \tfrac{1}{2}$, $x_{opt} = (0, 2, 0, 0, 8, 4)^T = x_2$, and
for $\tfrac{1}{2} \le \lambda$, $x_{opt} = (0, 0, 0, 2, 6, 6)^T = x_3$.

We observe that
$f(x) = (c^T + \lambda d^T)x = -12$ for $\lambda = \tfrac{1}{3}$, $x = x_1$ and
$f(x) = -12$ for $\lambda = \tfrac{1}{3}$, $x = x_2$.
We also observe that tableau (4) is the initial tableau for this problem
from section 3.4: the two stages in this instance have just reversed
the stages for the original version of the problem without the parameter
$\lambda$.
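
The characteristic values can be cross-checked using only the cost vectors c and d and the three optimum solutions quoted in this section, since at a characteristic value two successive optimum solutions must give the same value of f. The following is a minimal sketch using exact rational arithmetic from the Python standard library; the function names `objective` and `crossover` are illustrative.

```python
from fractions import Fraction

c = [-1, -2, -1, 0, 0, 0]
d = [1, 0, -3, 2, 0, -6]
x1 = [0, 4, 2, 0, 0, 0]
x2 = [0, 2, 0, 0, 8, 4]
x3 = [0, 0, 0, 2, 6, 6]

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def objective(x, lam):
    """f(x, lambda) = c^T x + lambda * d^T x."""
    return dot(c, x) + lam * dot(d, x)

def crossover(xa, xb):
    """Value of lambda at which xa and xb give the same objective value."""
    return Fraction(dot(c, xb) - dot(c, xa), dot(d, xa) - dot(d, xb))

lam1 = crossover(x1, x2)
lam2 = crossover(x2, x3)
print(lam1, lam2)                                   # 1/3 1/2
print(objective(x1, lam1), objective(x2, lam1))     # -12 -12
print(objective(x2, lam2), objective(x3, lam2))     # -16 -16
```

This check confirms the breakpoints but does not by itself establish optimality, which rests on the e.c.c.s in the tableaux.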


8.3 Removal of a Constraint 

Suppose that a l.p.p. is solved, and then the i-th (original) constraint
is removed and we wish to know whether the optimum solution we
already have is still optimum. Denote the two l.p.p.s by

minimise $c^T x$ subject to $Ax = b$, $x \ge 0$ (1)
and
minimise $c^T x$ subject to $\bar{A}x = \bar{b}$, $x \ge 0$, (2)
where (A, b) is $m \times (n+1)$ and $(\bar{A}, \bar{b})$ is $(m-1) \times (n+1)$, and denote
the optimum solution of (1) by $x_0$.

We cannot simply remove the i-th constraint from the final tableau
for (1) because in general all other rows have had a multiple of the
i-th row added to them. The crucial question is whether the i-th
constraint of (1) is active for $x = x_0$, because if it is not, then removing
it will not alter the situation. Thus if (1) is derived from a problem
with inequality constraints and the i-th slack or surplus variable is
positive in $x_0$ then $x_0$ is optimum for (2). For genuine equality constraints
in (1) all are active, and if they are independent, removing any one
changes R and we would expect that $x_0$ would not still be optimum.
We can be more precise if we examine the dual in conjunction with
the equilibrium theorem. Let $y_0$ be the optimum solution of the dual
of (1), then $(y_0)_i = 0$ implies that $x_0$ is optimum for (2) (ER).


8.4 Introduction of a Further Constraint 
Suppose that a l.p.p. is solved and then a further independent
constraint is imposed. We use the same notation (1) and (2) for the
two l.p.p.s as in the previous section, where now $(\bar{A}, \bar{b})$ is
$(m+1) \times (n+1)$ and
$$a_{m+1,1}x_1 + a_{m+1,2}x_2 + \ldots + a_{m+1,n}x_n = b_{m+1} \qquad (3)$$
denotes the extra constraint. If $x_0$ satisfies (3) then it is the optimum
solution for the new l.p.p. (ER).

If $\sum_{j=1}^{n} a_{m+1,j}(x_0)_j \ne b_{m+1}$, then we may proceed as follows, assuming
for convenience that $x_1, x_2, \ldots, x_m$ are the basic variables in $x_0$.

Insert the extra (m+1)-th row into the optimum tableau and subtract
$a_{m+1,i}$ × (i-th row) from this (m+1)-th row for $i = 1, 2, \ldots, m$ to
reduce its first m elements to zero; if the resulting right-hand side is negative,
multiply this row by -1. We now have a l.p.p. in which the columns
of the $(m+1) \times n$ matrix of coefficients include m columns of the
$(m+1) \times (m+1)$ unit matrix $I_{m+1}$, which we can solve by the two-part
simplex method with only one artificial variable in the first part.


8.5 Variation of b 
We consider only the change in which b is replaced by b̄ = b + δe_k,
i.e. b̄_i = b_i for i = 1,2,...,k−1,k+1,...,m; b̄_k = b_k + δ. Denoting
the optimum solution for the l.p.p. (1) of section 8.3 (i.e. δ = 0)
by x_0, then

    Ax_0 ≠ b + δe_k for δ ≠ 0,

so x_0 cannot still be the optimum solution. However, the values of
the basic variables in x_0 are given by B^{-1}b for the appropriate B^{-1},
and if B^{-1}b̄ is non-negative the basic variables at optimality are
unchanged and their values are given by B^{-1}b̄, because the corresponding
e.c.c.s c' are still given by

    c'^T = c^T − ĉ^T B^{-1} A.

It is easy to find the range of δ for which the basic variables at
optimality are unchanged;

    B^{-1}b̄ = B^{-1}b + δB^{-1}e_k = b' + δb^{(k)} say,

where b' denotes B^{-1}b, and b^{(k)} denotes the k-th column of B^{-1}.
Thus we require b'_i + δb_i^{(k)} ≥ 0 for i = 1,2,...,m, and so the set
of basic variables at optimality is unchanged for

    max_{b_i^{(k)} > 0} ( −b'_i / b_i^{(k)} )  ≤  δ  ≤  min_{b_i^{(k)} < 0} ( −b'_i / b_i^{(k)} ).
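
The range of δ can be computed directly from b' and the k-th column of
B^{-1}. The following is a minimal sketch of that computation; the name
rhs_range and the numerical data are hypothetical and serve only to
illustrate the two ratio tests.

```python
def rhs_range(b_prime, b_col_k):
    """Range (lo, hi) of delta for which b' + delta * b^(k) stays non-negative."""
    lo, hi = float("-inf"), float("inf")
    for bp, bk in zip(b_prime, b_col_k):
        if bk > 0:
            lo = max(lo, -bp / bk)   # positive entries bound delta from below
        elif bk < 0:
            hi = min(hi, -bp / bk)   # negative entries bound delta from above
    return lo, hi


b_prime = [4.0, 2.0, 6.0]    # hypothetical B^{-1} b
b_col_k = [1.0, -0.5, 0.0]   # hypothetical k-th column of B^{-1}
print(rhs_range(b_prime, b_col_k))
```

A zero entry of the k-th column imposes no restriction, and if the column
has no positive (or no negative) entries the range is unbounded on that
side.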


For the situation in which b depends linearly on a parameter λ, i.e.
b(λ) = b + λb̃ for some fixed vector b̃, we can use the analysis of this
section to obtain a sequence
of characteristic values of λ as in section 8.1, for each of which
the optimum solution has a particular set of basic variables. Instead
of doing so, we only observe that the same procedure as that developed
in section 8.1 can be used if we work with the dual problem instead.
We also mention, without giving any details, that if a_ij is replaced
by a_ij + δ then B^{-1} may or may not be changed, depending on whether
x_j is basic or non-basic at optimality. Denoting the new A by Ā and
the new B^{-1} by B̄^{-1} then the feasibility and optimality criteria,

    b' = B̄^{-1}b ≥ 0 and c'^T = c^T − ĉ^T B̄^{-1} Ā ≥ 0

can be used to determine the effects of the change (see {9}).


8.6 Variation of c 

In this section we amplify the remarks at the end of section 8.1 
and analyse the effects of a change in one cost coefficient. 

Suppose c_k is changed to c_k + δ and the optimum solution of the
l.p.p.

    minimise c^T x subject to Ax = b, x ≥ 0

is x_0. The optimality criterion is

    c'^T = c^T − ĉ^T B^{-1} A ≥ 0^T,

so we distinguish the two cases

(i) x_k non-basic at optimality, and

(ii) x_k basic at optimality, so that c_k appears in ĉ.

In case (i) ĉ^T B^{-1} A is unchanged, so x_0 is still an optimum solution
and the optimum value c^T x_0 is unchanged provided that the new k-th
e.c.c. is still non-negative,

    i.e. c'_k + δ ≥ 0 or δ ≥ −c'_k,

where c'_k is the k-th equivalent cost coefficient at optimality.

In case (ii), suppose that the k-th column of the final tableau is
the s-th column of I_m, and consider c'_j(δ), the j-th e.c.c. after the
change of c_k to c_k + δ, for j = 1,2,...,n.

If j = k, c'_k(δ) = c_k + δ − (ĉ^T + δe_s^T)(B^{-1}A)_k
                  = c_k + δ − (ĉ^T + δe_s^T)e_s
                  = c_k + δ − (c_k + δ) = 0,
so c'_k(δ) = 0 for any value of δ.

If j ≠ k and x_j is basic, then c'_j(δ) = c_j − (ĉ^T + δe_s^T)(B^{-1}A)_j,
where (B^{-1}A)_j = e_r for some r ≠ s, so that c'_j(δ) = c_j − c_j = 0 and
c'_j(δ) = 0 for any value of δ.

If j ≠ k and x_j is non-basic, then

    c'_j(δ) = c_j − (ĉ^T + δe_s^T)(B^{-1}A)_j
            = c_j − (ĉ^T B^{-1}A)_j − δa'_{sj},

where a'_{sj} is the (s,j)-th element of B^{-1}A, and we know that
c_j − (ĉ^T B^{-1}A)_j is the j-th e.c.c. c'_j at optimality.
Thus c'_j(δ) = c'_j − δa'_{sj}, and

    c'_j(δ) ≥ 0 if  δ ≤ c'_j/a'_{sj} for a'_{sj} > 0, or
                    δ ≥ c'_j/a'_{sj} for a'_{sj} < 0,

and for x_0 still to be the optimum solution we require

    c'_j(δ) ≥ 0, j = 1,2,...,n.

Hence for any k, 1 ≤ k ≤ n, x_0 remains the optimum solution when
c_k is changed to c_k + δ for

    max_{a'_{sj} < 0} ( c'_j/a'_{sj} )  ≤  δ  ≤  min_{a'_{sj} > 0} ( c'_j/a'_{sj} )

if x_k is basic and (B^{-1}A)_k = e_s (the maximum and minimum being taken
over j ≠ k, 1 ≤ j ≤ n, with x_j non-basic),
or δ ≥ −c'_k, if x_k is non-basic.

If x_k is non-basic the optimum value c^T x_0 is unchanged; if x_k is
basic the optimum value increases by δ(x_0)_k.
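
The two cases can be turned into a short calculation on the final
tableau. The following is a minimal sketch; the function names and the
sample e.c.c.s and tableau row are hypothetical, chosen only to exercise
both ratio tests.

```python
def cost_range_nonbasic(ecc_k):
    """x_k non-basic: delta may take any value >= -c'_k."""
    return (-ecc_k, float("inf"))


def cost_range_basic(ecc, row_s, nonbasic):
    """x_k basic in row s: ratio tests over the non-basic columns j."""
    lo, hi = float("-inf"), float("inf")
    for j in nonbasic:
        a = row_s[j]                      # a'_{sj}, element of B^{-1} A
        if a > 0:
            hi = min(hi, ecc[j] / a)
        elif a < 0:
            lo = max(lo, ecc[j] / a)
    return lo, hi


ecc = [0.0, 0.0, 3.0, 2.0, 4.0]       # hypothetical e.c.c.s at optimality
row_s = [1.0, 0.0, 2.0, -1.0, 0.0]    # hypothetical s-th row of B^{-1} A
print(cost_range_basic(ecc, row_s, nonbasic=[2, 3, 4]))
print(cost_range_nonbasic(ecc[2]))
```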
Exercises 8


1. Solve the l.p.p.
minimise (c + λd)^T x subject to


oie (aan @ 2 
1-2 4 O}x={2], x=0, 
te. eee 4 


where ¢’ = (-2,-3,0,-1) and d” = (0,1,0,-3). There is only one 
characteristic value; choose A, = 0 (see exercise 6.2). Check your 
result by considering A = 3 and A = | and inserting the corresponding 
¢(A) in the appropriate tableau. 

2. Solve the /.p.p. 

maximise x,+x, subjectto x,,x,20 and 
3x, + 2x, = 6 
to ky 8 I 
2,4 3x, = 6. 
Use the dual problem to determine which constraint may be omitted, 
without changing the optimum solution. Verify your results with 
a diagram. 

3. For the l.p.p.
minimise (c + λd)^T x subject to Ax = b, x ≥ 0,
what is the maximum number of characteristic values of λ?

4. For the example of section 3.4 find the range of values of each
of b_1, b_2, b_3 in turn for which the basic variables at optimality
are unchanged.

5. For the example of section 3.4 find the range of values of each
of c_1, c_2, c_3 in turn for which the optimum solution is unchanged.


CHAPTER 9 


THE SHOR-KHACHIAN ELLIPSOID METHOD 


9.1 

An important development in linear programming theory is the 
method due to N. Z. Shor and L. G. Khachian which leads to a 
polynomial-time algorithm in contrast with the simplex method which 
yields an exponential-time algorithm. 

This comparison is discussed in this section, while some details
of the method itself are described, and some of its properties
established, in section 9.4. In section 9.2 we see how the ellipsoid method,
which is directly concerned with finding a solution of a system of
strict inequalities Ax < b, can be used to solve l.p.p.s. The method
itself, despite its great theoretical interest, is unlikely to reduce the
dominance of the simplex method as the approach for solving l.p.p.s
in practice, so we do not discuss its practical implementation. Instead,
as the method involves constructing a sequence of ellipsoids in n-space,
this and other aspects of the background linear algebra are discussed
briefly in section 9.3.


As we saw in section 2.9, for a l.p.p. with n variables, there could
be as many as n!/(m!(n−m)!) stages in the simplex method. As n
increases, n! increases like (2πn)^{1/2}(n/e)^n (this is Stirling's
approximation to n!). In a worst possible case, the simplex method could take
as many stages as this, and so a definite upper bound on the time,
or amount of work, required will involve the factor (n/e)^n.

When the amount of time possibly required by an algorithm involves
the number of variables as an exponent it is said to be an
exponential-time algorithm. In contrast, the amount of time required for the method
of Gaussian elimination, for example, for solving a system of n linear
equations in n variables increases with n like n^3 (see Appendix 3).
This is an example of a polynomial-time algorithm, where n appears
in the expression for the time required with a fixed exponent
independent of n. Both the expressions (n/e)^n and n^3 increase rapidly with
n, but (n/e)^n increases very much more rapidly; with n = 100 their
respective values are approximately 10^{157} and 10^6, and with n = 1000,
10^{2566} and 10^9.
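
The orders of magnitude quoted above are easy to reproduce. The following
short sketch works with base-10 logarithms so that no very large numbers
need to be formed; the function names are illustrative only.

```python
import math


def log10_exponential(n):
    """log10 of (n/e)^n, the dominant factor in Stirling's approximation."""
    return n * math.log10(n / math.e)


def log10_cubic(n):
    """log10 of n^3."""
    return 3 * math.log10(n)


for n in (100, 1000):
    print(n, round(log10_exponential(n)), round(log10_cubic(n)))
```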

The usual way to compare the amounts of time or amounts of
work needed by different algorithms to solve the same problem is
to evaluate T(n), the number of arithmetic operations (additions and
multiplications) each needs (see Appendix 3 for an example). T(n)
increases so rapidly with n for exponential-time algorithms that they
soon become impractical for even the fastest computers. This means
that the development of a polynomial-time algorithm for l.p.p.s was
a major mathematical goal. However these considerations are somewhat
theoretical and in practice two other aspects are highly relevant.

The first is that, although the nature of T(n) is of greatest importance,
the constant or other factors multiplying the dominant term are also
important. For the ellipsoid method the bound T(n) = 4(n + 1)^2 L ×
αn(m + n + β) can be established, where αn(m + n + β) is a bound
for the number of operations required at each stage and α is small
(see {15}). (The number of operations required by each stage of the
simplex method (see section 7.2) is essentially m(n−m).) The number
L appears frequently in the analysis of the ellipsoid method. It is
approximately the total number of binary digits in all the non-zero
coefficients involved in the l.p.p. and can clearly be very large.

The second aspect is the way in which the amount of time suggested
by T(n) compares with the amount of time actually taken in practice.
In cases where T(n) is almost always very pessimistic its practical
relevance may be slight. This is so for the simplex method, where
the number of stages rarely exceeds a small multiple of m and can
be expected to be nothing like exponential in n. For the ellipsoid
method also, the bound given above may be somewhat pessimistic
in practice, but the number L is explicitly involved in the algorithm
(see section 9.4); and because the simplex method performs so
efficiently (for general l.p.p.s at least) the ellipsoid method is not
a practical alternative and is unlikely to have the impact one might
at first expect of a polynomial-time algorithm.


9.2 

The Shor-Khachian method finds a solution of a system of strict
linear inequalities Ax < b (if solutions exist), for the case where
the elements of A and b are integers. The restriction to integer
coefficients is crucial for establishing finiteness of the algorithm (and
hence its polynomial-time property) but, as we shall see in section
9.4, integer coefficients are not necessary for the operations of the
algorithm itself. In practice any l.p.p. solved on a computer may
be regarded as having integer coefficients because all the coefficients
when stored must have a finite number of binary digits, and so
multiplication of the (stored) constraints by appropriate powers of
2 would convert the coefficients to integers but leave the feasible
region unchanged. Of course if the coefficients involved do not have
an appropriate finite binary representation then the rounding-off that
is required is equivalent to a perturbation of the problem; but we
have seen in chapter 8 (and see also Appendix 3) that the effect
of such a perturbation could be examined if necessary.

To write a l.p.p. as a single system of inequalities, without an
objective function, we make use of the duality theorem. Suppose
the problem is in standard primal form

    minimise c^T x subject to Ax ≥ b, x ≥ 0.                   (1)

This problem has a solution if and only if the dual problem

    maximise y^T b subject to y^T A ≤ c^T, y ≥ 0               (2)

has a solution.
Thus the l.p.p. (1) has a solution if and only if the combined
inequalities

    (A, 0)(x; y) ≥ b,  (0, A^T)(x; y) ≤ c,  (x; y) ≥ 0          (3)

have a solution for (x; y), the (n + m)-vector consisting of x followed
by y. A solution (x; y) of (3) does not necessarily
provide the optimum solutions of (1) and (2) unless we involve the
condition c^T x = y^T b, which is satisfied by optimum solutions. Since
c^T x − y^T b ≥ 0 for any feasible solution (x; y) of (3), the requirement
c^T x − y^T b ≤ 0 restricts us to optimum solutions of (1) and (2).
Thus the constraints

    (A, 0)(x; y) ≥ b,  (0, A^T)(x; y) ≤ c,  (x; y) ≥ 0,  c^T x − y^T b ≤ 0     (4)

have a solution (x_0; y_0) if and only if the l.p.p. (1) has a solution, and
x_0 and y_0 are optimum solutions of (1) and (2) respectively.
Written as a single system of inequality constraints (4) becomes

    [ −A      0   ]             [ −b ]
    [  0     A^T  ]   ( x )     [  c ]
    [ c^T   −b^T  ]   ( y )  ≤  [  0 ]                          (5)
    [ −I      0   ]             [  0 ]
    [  0     −I   ]             [  0 ]

i.e. Āx̄ ≤ b̄ where Ā is (m + n + 1 + n + m) × (n + m).
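
The block structure of (5) can be assembled mechanically. The following
is a minimal sketch in plain Python lists, assuming the primal is given as
minimise c^T x subject to Ax ≥ b, x ≥ 0; the function name
combined_system and the one-constraint example are hypothetical.

```python
def combined_system(A, b, c):
    """Rows and right-hand sides of system (5) in the unknowns (x; y)."""
    m, n = len(A), len(c)
    rows, rhs = [], []
    for i in range(m):                       # -A x <= -b  (primal feasibility)
        rows.append([-A[i][j] for j in range(n)] + [0.0] * m)
        rhs.append(-b[i])
    for j in range(n):                       # A^T y <= c  (dual feasibility)
        rows.append([0.0] * n + [A[i][j] for i in range(m)])
        rhs.append(c[j])
    rows.append(list(c) + [-b_i for b_i in b])   # c^T x - b^T y <= 0
    rhs.append(0.0)
    for t in range(n + m):                   # -x <= 0 and -y <= 0
        rows.append([-1.0 if s == t else 0.0 for s in range(n + m)])
        rhs.append(0.0)
    return rows, rhs


# Hypothetical example: minimise x1 + x2 subject to x1 + 2 x2 >= 2, x >= 0.
A, b, c = [[1.0, 2.0]], [2.0], [1.0, 1.0]
rows, rhs = combined_system(A, b, c)
z = [0.0, 1.0, 0.5]   # optimum x = (0, 1) followed by optimum dual y = 0.5
print(len(rows), all(sum(r_j * z_j for r_j, z_j in zip(r, z)) <= h + 1e-12
                     for r, h in zip(rows, rhs)))
```

For this small problem the optimum primal and dual values are both 1, so
the stacked vector z satisfies every row of (5).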

In general the set of solutions of (5) will be a single point in
(m + n)-space, whereas the set of solutions R of a feasible system
of strict inequalities

    A'x < b'

is an open set with infinitely many points and a non-zero volume
V(R). (If A'x_0 < b', then A'(x_0 + δx) < b' for any δx sufficiently
small.)

That the magnitude of V(R) is strictly positive is another aspect
that is crucial for establishing the polynomial-time property of the
ellipsoid method, as we explain in section 9.3, and so a given system
Ax ≤ b must be replaced by a system of strict inequalities. This
can be done by perturbing b slightly, and it can be proved (see {15})
that

    Ax ≤ b,

where the elements of A and b are integers, has a solution if and
only if

    Ax < b + 2^{−L} e, where e = (1,1,...,1)^T,

has a solution.
9.3 
The Shor-Khachian method for finding a solution of

    Ax < b,                                                      (1)

where A is m × n, is sequential, and at the k-th stage we have
an n-vector x_k and an n × n matrix B_k. These define an ellipsoid
E_k in n-space with centre x_k which contains at least a part, S say,
of the feasible region of (1). If x_k does not satisfy (1) then x_{k+1} and
B_{k+1} are constructed so that the new ellipsoid E_{k+1} still contains the
whole of S but has volume V(E_{k+1}) which is less than V(E_k), the
volume of E_k. We will show that the volumes satisfy V(E_{k+1})/V(E_k)
≤ γ < 1, where γ is a constant. Therefore V(E_k) eventually becomes
less than V(S) if x_k never satisfies (1), which contradicts the existence
of solutions of (1). We will also show that the ellipsoid E_k constructed
at each stage does have the required properties. Some comments
which provide background information about ellipsoids and affine
transformations of n-space may be useful and these are given below.
They require results from linear algebra that are not needed for the
theory of the simplex method. These results are stated as they are
needed, with brief comments but without proof; proofs and further
explanation can be found in many texts on linear algebra, including
{1}, {2} and {5}.
If E denotes the unit sphere with centre the origin,

    E = {x | x^T x ≤ 1},

then for any non-singular matrix Q

    E_Q = {Qx | x ∈ E} = {Qx | x^T x ≤ 1}

is an ellipsoid in n-space with centre the origin. Alternatively

    E_Q = {x | x^T Q^{-T} Q^{-1} x ≤ 1} = {x | x^T B^{-1} x ≤ 1},

where Q^{-T} Q^{-1} = B^{-1}, i.e. B = QQ^T, because if x ∈ E then
x^T x ≤ 1 and considering the vector Qx,

    (Qx)^T B^{-1} (Qx) = x^T Q^T Q^{-T} Q^{-1} Q x = x^T x ≤ 1.

Any matrix B of the form QQ^T, where Q is non-singular, is symmetric
and positive definite (ER). It represents the ellipsoid E_Q, which is
the transformation T of the unit sphere E, where T(x) = Qx. Similarly,
any symmetric positive-definite matrix B represents an ellipsoid
{x | x^T B^{-1} x ≤ 1} because there exists a non-singular matrix Q such
that QQ^T = B. The matrix Q can be expressed in terms of the eigenvalues
λ_j and eigenvectors y_j of B, j = 1,2,...,n. Because B is symmetric
and positive-definite all its eigenvalues satisfy λ_j > 0 and it has a
corresponding set of independent mutually orthogonal eigenvectors
y_j, j = 1,2,...,n, satisfying

    By_j = λ_j y_j, y_i^T y_j = 0 if i ≠ j and y_j^T y_j = 1, for i,j = 1,2,...,n.

If y_j is the j-th column of the n × n matrix Y, then Y^T = Y^{-1},
and with D the diagonal matrix with d_jj = λ_j, we have BY = YD;

so B = YDY^T = YD^{1/2}D^{1/2}Y^T = QQ^T,

where Q = YD^{1/2} and D^{1/2} is the diagonal matrix with d_jj = λ_j^{1/2},
for j = 1,2,...,n.
Notice that B^{-1} is also symmetric and positive definite, so it
represents an ellipsoid

    E_{(Q^T)^{-1}} = {x | x^T B x ≤ 1},

with B^{-1} = (Q^T)^{-1} Q^{-1} = (YD^{-1/2})(D^{-1/2}Y^T) = YD^{-1}Y^T.

If we now consider the n mutually orthogonal unit vectors e_j,
j = 1,2,...,n, which are the axes of the (spherical) ellipsoid E, then

    Qe_j = YD^{1/2}e_j = Y(λ_j^{1/2}e_j) = λ_j^{1/2}Ye_j = λ_j^{1/2}y_j.

So the effect of the transformation T, T(x) = Qx, is to transform
the axes of E into the mutually orthogonal unit eigenvectors y_j of
B, which are the directions of the axes of E_Q, and to "stretch"
them by the factors λ_j^{1/2}. Thus the volume of E_Q, V(E_Q), will be

    V(E_Q) = λ_1^{1/2} λ_2^{1/2} ... λ_n^{1/2} V(E).

Now

    λ_1 λ_2 ... λ_n = det(B) = det(QQ^T) = det(Q)det(Q^T) = (det(Q))^2,

so that V(E_Q) = |det(Q)| V(E).

If a translation by a vector z is added to the linear transformation
T(x) = Qx we have an invertible affine transformation T, T(x) =
Qx + z (invertible because Q^{-1} exists and so the inverse transformation
exists (see (2) below)). This transformation T maps E onto the ellipsoid
E_{Q,z} with centre z,

    E_{Q,z} = {x | (x − z)^T B^{-1} (x − z) ≤ 1} where B = QQ^T,

because if x_0 ∈ E, x_0^T x_0 ≤ 1 and then

    ((Qx_0 + z) − z)^T B^{-1} ((Qx_0 + z) − z) ≤ 1

and so T(x_0) ∈ E_{Q,z}. It is convenient to write T(E) for E_{Q,z}. The
translation of the ellipsoid E_Q onto E_{Q,z} does not affect its volume
so we have V(E_{Q,z}) = |det(Q)| V(E).
The inverse T^{-1} of the transformation T is defined by

    T^{-1}(x) = Q^{-1}(x − z) = Q^{-1}x − Q^{-1}z                  (2)

because T(T^{-1}(x)) = T^{-1}(T(x)) = x (ER).
Notice that if E = {x | x^T x ≤ 1},
then T(E) = {x | (T^{-1}(x))^T T^{-1}(x) ≤ 1};
and generally, if E_0 = {x | (x − z_0)^T B_0^{-1} (x − z_0) ≤ 1},     (3)
then T(E_0) = {x | (T^{-1}(x) − z_0)^T B_0^{-1} (T^{-1}(x) − z_0) ≤ 1}     (4)

    = {x | (x − z − Qz_0)^T Q^{-T} B_0^{-1} Q^{-1} (x − z − Qz_0) ≤ 1},    (5)

so B_0 represents the ellipsoid E_0 which has centre z_0, and QB_0Q^T
represents the ellipsoid T(E_0) which has centre z + Qz_0.

Also QB_0Q^T is symmetric and positive-definite if B_0 is symmetric
and positive-definite, and if S ⊆ E_0 then T(S) ⊆ T(E_0) (ER).
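
The statement that QB_0Q^T and z + Qz_0 represent the image ellipsoid can
be checked on a small case. The following is a minimal sketch in 2-space
using plain Python lists; all function names and the particular Q, z, B_0
and z_0 are hypothetical, chosen so that the arithmetic is exact.

```python
def mat_mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def transpose(M):
    return [list(col) for col in zip(*M)]


def mat_vec(M, v):
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, v)) for row in M]


def inverse_2x2(M):
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return [[M[1][1] / det, -M[0][1] / det], [-M[1][0] / det, M[0][0] / det]]


def quadratic_form(B, centre, x):
    """(x - centre)^T B^{-1} (x - centre) for a 2 x 2 matrix B."""
    d = [x_i - c_i for x_i, c_i in zip(x, centre)]
    return sum(d_i * w_i for d_i, w_i in zip(d, mat_vec(inverse_2x2(B), d)))


def map_ellipsoid(Q, z, B0, z0):
    """Matrix and centre of T(E_0) for T(x) = Qx + z."""
    B = mat_mul(mat_mul(Q, B0), transpose(Q))
    centre = [z_i + q_i for z_i, q_i in zip(z, mat_vec(Q, z0))]
    return B, centre


Q, z = [[1.0, 1.0], [0.0, 2.0]], [3.0, -1.0]
B0, z0 = [[4.0, 0.0], [0.0, 1.0]], [1.0, 0.0]
B, centre = map_ellipsoid(Q, z, B0, z0)
x = [3.0, 0.0]                                        # a boundary point of E_0
Tx = [q_i + z_i for q_i, z_i in zip(mat_vec(Q, x), z)]
print(B, centre, quadratic_form(B0, z0, x), quadratic_form(B, centre, Tx))
```

The boundary point x of E_0 is carried to a boundary point of T(E_0):
both quadratic forms take the value 1.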
9.4 The Shor-Khachian Algorithm
At the k-th stage the n-vector x_k and the n × n matrix B_k define
an ellipsoid E_k with centre x_k. If x_k does not satisfy Ax < b then
for some i, 1 ≤ i ≤ m, a_i^T x_k ≥ b_i, where a_i^T is the i-th row of A;
and, writing a for a_i, we replace
x_k and B_k by x_{k+1} and B_{k+1} where

    x_{k+1} = x_k − (1/(n + 1)) B_k a / (a^T B_k a)^{1/2},                        (1)

    B_{k+1} = (n^2/(n^2 − 1)) ( B_k − (2/(n + 1)) (B_k a)(B_k a)^T / (a^T B_k a) ).   (2)

The expressions (1) and (2) are a convenient definition of the algorithm
but are not the most suitable for practical implementation. Other
versions slightly improve the efficiency and improve the numerical
stability (see {16}). The expression (B_k a)(B_k a)^T is an n × n matrix
with rank 1 so that, apart from the factor n^2(n^2 − 1)^{-1}, B_{k+1} is obtained
from B_k by a rank-one modification (this is a frequent device in
non-linear optimisation algorithms).
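
One stage of the construction, with x_{k+1} = x_k − B_k a/((n + 1)(a^T B_k a)^{1/2})
and B_{k+1} = (n^2/(n^2 − 1))(B_k − 2(B_k a)(B_k a)^T/((n + 1) a^T B_k a)),
can be written out directly. The following is a minimal sketch using
plain Python lists; as the text notes, this literal form is convenient
rather than numerically robust, and the function name ellipsoid_step is
an assumption made for illustration. It requires n ≥ 2.

```python
import math


def ellipsoid_step(x, B, a):
    """Return (x_next, B_next) from the centre x, matrix B and violated row a."""
    n = len(x)
    Ba = [sum(B[i][j] * a[j] for j in range(n)) for i in range(n)]
    aBa = sum(a[i] * Ba[i] for i in range(n))
    root = math.sqrt(aBa)
    x_next = [x[i] - Ba[i] / ((n + 1) * root) for i in range(n)]
    scale = n * n / (n * n - 1.0)
    B_next = [[scale * (B[i][j] - 2.0 * Ba[i] * Ba[j] / ((n + 1) * aBa))
               for j in range(n)] for i in range(n)]
    return x_next, B_next


# Unit sphere in 2-space with a = -e_1.
x_star, B_star = ellipsoid_step([0.0, 0.0], [[1.0, 0.0], [0.0, 1.0]], [-1.0, 0.0])
print(x_star, B_star)
```

With x_k = 0, B_k = I and a = −e_1 the step moves the centre to e_1/(n + 1)
and produces a diagonal matrix with entries n^2/(n + 1)^2 and n^2/(n^2 − 1).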

Now consider the hyperplane {x | a^T(x − x_k) = 0}. This contains
the centre of E_k and therefore separates E_k into two halves

    E_{k−} in which a^T(x − x_k) ≤ 0, and

    E_{k+} in which a^T(x − x_k) > 0.

If a^T x_k ≥ b_i then a^T x > b_i in the whole of E_{k+}, so that E_{k−} contains
the whole of the set S contained in E_k. The formulae (1) and (2)
ensure that E_{k+1} contains the whole of E_{k−}. The validity of the
construction (1) and (2) is established in theorem 11: the geometrical
decrease in volume at each stage and hence convergence of the
algorithm is established in theorem 12.

The algorithm begins with E_0 defined by x_0 = 0 and B_0 = 2^{2L} I.
It can be shown (see {15}) that if Ax < b has any solutions then
the set of solutions S contained in E_0 has volume at least 2^{−(n+1)L}.

It is extremely helpful, before proving theorems 11 and 12, to simplify
the situation by replacing the general ellipsoid E_k, represented by
x_k and B_k, and subsequent ellipsoid E_{k+1}, represented by x_{k+1} and
B_{k+1}, by E' and E*, represented by x' and B', and x* and B*
respectively, where E', x' and B' have a particularly simple form,
namely x' = 0 and B' = I, so that E' is just the unit sphere with
centre the origin. To achieve this we apply an invertible affine
transformation T which maps E_k onto E', so that
T(E_k) = E', and we write T(E_{k+1}) = E*.

We can choose T so that the vector a_i involved at the k-th stage
becomes −e_1 and then instead of a_i, x_k, B_k, x_{k+1} and B_{k+1} the
properties of the Shor-Khachian algorithm can be established using
−e_1, 0, I, x*, and B*, where x* = T(x_{k+1}) and B* is the matrix
which defines E*. The use of the transformation T is a valid device
because the three properties we are concerned with,

    B_{k+1} symmetric and positive-definite,

    E_{k−} contained in E_{k+1}, and

    V(E_{k+1}) ≤ γ V(E_k) where γ < 1,

are invariant under invertible affine transformations (see section 9.3
and exercise 9.1). We observe that T does not need to preserve the
value of a^T x_k, because a merely defines a hyperplane perpendicular
to a and containing the centre of E_k and hence the separation of
E_k into E_{k−} and E_{k+}.

Substituting 0, −e_1 and I in (1) and (2) for x_k, a and B_k gives

    x* = (1/(n + 1)) e_1,                                          (3)

and B* = (n^2/(n^2 − 1)) ( I − (2/(n + 1)) e_1 e_1^T )              (4)

which is a diagonal matrix with diagonal elements

    n^2/(n + 1)^2, n^2/(n^2 − 1), ..., n^2/(n^2 − 1).

To verify that the transformation T which maps E_k onto E' also
maps E_{k+1} onto the ellipsoid E* defined by x* and B* of (3) and
(4) we identify the inverse transformation T^{-1}. Writing T^{-1}(x) = Qx + z,
z is clearly x_k and we see how Q can be found. Assuming B_k is
symmetric and positive-definite (see theorem 11) then we know there
is a non-singular matrix, Q_1 say, such that Q_1 Q_1^T = B_k. Now let
α = ((Q_1^T a)^T (Q_1^T a))^{1/2} denote the length of Q_1^T a, so that
a' = α^{-1} Q_1^T a is a unit vector in the direction of Q_1^T a. An
orthogonal matrix, Q_0 say, can be found such that

    Q_0(−e_1) = a'.                                                (5)

The matrix Q_0 represents the rotation of n-space which maps −e_1
onto a', and we observe that

    α = ((Q_0^T Q_1^T a)^T (Q_0^T Q_1^T a))^{1/2} = (a^T B_k a)^{1/2}

because Q_0 Q_0^T = I. The matrices Q_0 and Q_1 are not unique and can
be constructed in various ways. Their construction is not discussed
as T is being used purely as a convenient device for theoretical purposes
and does not appear in the algorithm itself.

The required transformation T can now be defined by

    T^{-1}(x) = Q_1 Q_0 x + x_k

or T(x) = Q_0^T Q_1^{-1} (x − x_k) = Q_0^T Q_1^{-1} x − Q_0^T Q_1^{-1} x_k.     (6)

Using the formulae (3) and (4) we see that, with T given by
(6) and E_k = {x | (x − x_k)^T B_k^{-1} (x − x_k) ≤ 1}, T(E_k) = {x | x^T x ≤ 1} (ER).
Verifying that T(E_{k+1}) = E* is straightforward but worthwhile (see
exercise 9.2).

We can now prove that the construction (1) and (2) for the 
Shor-Khachian algorithm is valid. 


Theorem 11
(i) If B_k is symmetric and positive-definite then B_{k+1} given by (2)
is symmetric and positive-definite.
(ii) The whole of E_{k−} is contained in E_{k+1}.
Both these results can be verified in terms of E', E*, x' = 0,
B' = I, and x* and B* given by (3) and (4).
From (4), B* is diagonal and therefore symmetric; its diagonal
elements are all positive so it is positive-definite.
As the vector which defines E'_− and E'_+ is −e_1,

    E'_− = {x | x^T x ≤ 1, −e_1^T x ≤ 0}
         = {x | x^T x ≤ 1, 0 ≤ x_1 ≤ 1}.                           (7)

Let x ∈ E'_−, then with x* = (1/(n + 1)) e_1 and

    B*^{-1} = diag( (n + 1)^2/n^2, (n^2 − 1)/n^2, ..., (n^2 − 1)/n^2 ),

(x − x*)^T B*^{-1} (x − x*)
    = x^T B*^{-1} x − 2x^T B*^{-1} x* + x*^T B*^{-1} x*
    = ((n^2 − 1)/n^2) x^T x + (((n + 1)^2 − (n^2 − 1))/n^2) x_1^2
      + ((n + 1)^2/n^2) ( 1/(n + 1)^2 − 2x_1/(n + 1) )
    = ((n^2 − 1)/n^2)(x^T x − 1) + (n^2 − 1)/n^2 + 1/n^2
      + (2(n + 1)/n^2)(x_1^2 − x_1)                                (8)
    ≤ 0 + 1 + (2(n + 1)/n^2) x_1(x_1 − 1) ≤ 1,
for n ≥ 2.
The expression (8) enables us to describe the ellipsoid E* geometrically,
because x is on the boundary of E* when

    (x − x*)^T B*^{-1} (x − x*) = 1.

Therefore points x such that x^T x = 1 and x_1 = 0 are on the boundary
of E* and these points are also points in the intersection of
E' and the hyperplane {x | −e_1^T x = 0} through the centre of E'
perpendicular to −e_1. A further isolated point, x_1 = 1, x^T x = 1,
is common to both boundaries and so E' and E* are tangential there.
In 2-space, or in a plane section through the origin and the x_1
axis, the situation is described by the diagram (9).


[Diagram: E' and E*, with the hyperplane {x | x^T(−e_1) = 0}]           (9)
In general E_{k+1} intersects E_k in the (n − 1)-dimensional ellipsoid
in which the hyperplane {x | a^T(x − x_k) = 0} intersects E_k, and also
intersects E_{k−} tangentially, at T^{-1}(e_1). The point

    T^{-1}(e_1) = Q_1 Q_0 e_1 + x_k = −(1/α) B_k a + x_k,

is not (in general) in the plane of the diagram (10), which is the
plane defined by x_k and a.
To establish the convergence of the ellipsoid algorithm within a
specified number of stages we prove that the volumes of the ellipsoids
generated decrease at least geometrically.


Theorem 12
The volume of E_{k+1} satisfies

    V(E_{k+1}) ≤ γ V(E_k)                                          (11)

where γ < 1 and γ is independent of k.
It is sufficient to prove that V(E*) ≤ γV(E') for γ < 1, and
as B* = QQ^T we can establish (11) by evaluating |det(Q)|.

From (4), (det(Q))^2 = det(B*) = (n^2/(n + 1)^2) (n^2/(n^2 − 1))^{n−1},

so that log|det(Q)| = log n − log(n + 1)
                      + ((n − 1)/2)(log n^2 − log(n − 1) − log(n + 1))
                    = ½(t(n) − t(n + 1)) < 0,

where t(n) = n log n − (n − 1) log(n − 1) and t(n) increases as
n increases (ER). Therefore |det(Q)| = γ < 1, where γ is independent
of k.

From (11) we have V(E_k) ≤ γ^k V(E_0), so that if the algorithm does
not terminate with a feasible x_k satisfying Ax_k < b, eventually V(E_k),
for k = 4(n + 1)^2 L, becomes less than V(S), the volume of the
set of solutions which are contained in E_0 if Ax < b has any solutions.
This is a contradiction and so if it happens then Ax < b has no
solutions.

The method of ellipsoids has drawbacks which, for general l.p.p.s,
make the simplex method much superior for practical purposes. Firstly,
if a l.p.p. is infeasible, to reveal this the prescribed, very large, number
of stages is required. Secondly, and more importantly, the number
γ is extremely close to 1 (more so as n increases) so that the ellipsoids
generated may decrease in volume very slowly. In practice this is
usually the case and the ellipsoids and their centroids that occur
show no regular behaviour that could be used to predict the eventual
outcome.
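
How close γ is to 1 can be seen by evaluating
γ = |det(Q)| = (n/(n + 1)) (n^2/(n^2 − 1))^{(n−1)/2} for a few values of n.
The following is a small sketch of that evaluation; the function names
gamma and t are illustrative, with t the function
t(n) = n log n − (n − 1) log(n − 1) used in the proof of theorem 12.

```python
import math


def gamma(n):
    """Volume ratio V(E*)/V(E') for one stage in n-space (n >= 2)."""
    return (n / (n + 1.0)) * (n * n / (n * n - 1.0)) ** ((n - 1) / 2.0)


def t(n):
    """t(n) = n log n - (n - 1) log(n - 1), for n >= 2."""
    return n * math.log(n) - (n - 1) * math.log(n - 1)


for n in (2, 10, 100):
    print(n, gamma(n), math.exp(0.5 * (t(n) - t(n + 1))))
```

The two printed columns agree, and the ratio rises from about 0.77 at
n = 2 towards 1 as n grows, which is why many stages are needed in
practice.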
Exercises 9


1. Explain how a non-negative solution of a system of inequalities
Ax ≤ b can be found (see exercise 4.4).

2. Let E_k be the ellipsoid {x | (x − x_k)^T B_k^{-1} (x − x_k) ≤ 1} and E_{k+}
and E_{k−} the two halves of E_k defined by the hyperplane
H = {x | a^T(x − x_k) = 0}. If T is the affine transformation which
maps E_k onto E', the unit sphere with centre the origin, verify
that T(H) defines two halves of E' which are T(E_{k+}) and T(E_{k−}).

3. Using the definition (6) in section 9.4 verify that the affine
transformation T which maps E_k onto E' also maps E_{k+1} onto
E*.

4. For the case n = 2 obtain x* and B* and verify that diagram
(9) describes this case. How does this diagram change as n increases?

5. Evaluate the amount of work (e.g. the number of multiplications)
in one stage of the ellipsoid algorithm, taking into account the
structure of the matrix involved (see (5) section 9.2), and compare
it with the amount of work in one stage of the simplex method.

6. Does the choice of a_i (see section 9.4) affect the number of stages
needed in the ellipsoid method? Discuss the practical implications.

7. A "deeper cut" of the current ellipsoid E_k than that through the
centroid x_k and defined by the hyperplane a_i^T(x − x_k) = 0 can
be made so that E_{k−} is less than half of E_k. What is the best
alternative hyperplane based on the violated constraint a_i^T x < b_i?


CHAPTER 10 


TRANSPORTATION AND SIMILAR PROBLEMS 


10.1 

The matrix of coefficients A of a transportation problem is very 
sparse and so such a problem would be a natural candidate for solution 
by the revised simplex method. However, as we saw in section 1.3 
A is more than just sparse: there is a pronounced special structure 
of non-zero elements, all of which have value 1, and the structure 
is exactly the same for all transportation problems. This results in 
even more efficient algorithms in which the initial data is retained 
unchanged throughout. These algorithms are particularly interesting 
because although they can be defined without reference to the simplex 
method, they really consist of the simplex method performed implicitly, 
and also because the duality theorem and the dual problem play a 
crucial part. 

The problem is to choose the amounts x_ij of some commodity to
be transported from each of m sources D_1, D_2, ..., D_m to each of n
destinations B_1, B_2, ..., B_n so that the total cost
Σ_{i=1}^{m} Σ_{j=1}^{n} c_ij x_ij is minimised.

For i = 1,2,...,m the total amount taken from D_i, Σ_{j=1}^{n} x_ij, cannot
exceed the amount d_i which is available there, and for j = 1,2,...,n
the total amount taken to B_j, Σ_{i=1}^{m} x_ij, should not be less than the
amount b_j required there.

As we observed in section 1.3, if

    Σ_{i=1}^{m} d_i = Σ_{j=1}^{n} b_j                               (1)

then we have the l.p.p.

    minimise Σ_{i=1}^{m} Σ_{j=1}^{n} c_ij x_ij subject to
    x_ij ≥ 0, Σ_{j=1}^{n} x_ij = d_i, Σ_{i=1}^{m} x_ij = b_j,        (2)
    i = 1,2,...,m, j = 1,2,...,n.

In practice we are unlikely to have exact equality in (1) and so there
will have to be a source with some surplus or a destination whose
requirements are not met. In order to produce a l.p.p. in canonical
form, if Σ_{i=1}^{m} d_i < Σ_{j=1}^{n} b_j we introduce a fictitious source D_{m+1}
containing d_{m+1} = Σ_{j=1}^{n} b_j − Σ_{i=1}^{m} d_i of the commodity, and if
Σ_{i=1}^{m} d_i > Σ_{j=1}^{n} b_j we introduce a fictitious destination B_{n+1} requiring
b_{n+1} = Σ_{i=1}^{m} d_i − Σ_{j=1}^{n} b_j of the commodity. In either case the
corresponding fictitious transportation cost coefficients are all zero. We
shall assume that this modification of the problem has already been
made if necessary, so from now on we have Σ_{i=1}^{m} d_i = Σ_{j=1}^{n} b_j and the
l.p.p.

    minimise c^T x subject to x ≥ 0 and Ax = b, where              (3)

    c^T = (c_11, c_12, ..., c_1n, c_21, ..., c_m1, c_m2, ..., c_mn),

    x^T = (x_11, x_12, ..., x_1n, x_21, ..., x_m1, x_m2, ..., x_mn),

    b^T = (d_1, d_2, ..., d_m, b_1, b_2, ..., b_n), and
    A is the (m + n) × (mn) matrix described in section 1.3.


Since the sum of the first m rows of A and the sum of the last
n rows of A are the same, A has rank less than (m + n) (ER), and
in fact r(A) = (m + n − 1) (see exercise 10.4).
The dual problem is

    maximise y^T b subject to y^T A ≤ c^T,                          (4)

where y is an (m + n)-vector. It is helpful to write y as

        ( u )
    y = ( v )

where u is an m-vector and v is an n-vector.
Then (4) can be written

    maximise Σ_{i=1}^{m} u_i d_i + Σ_{j=1}^{n} v_j b_j subject to u_i + v_j ≤ c_ij,
    i = 1,2,...,m, j = 1,2,...,n.                                   (5)

As there are only (m + n − 1) independent primal equality constraints,
and omitting any one of the (m + n) equality constraints gives an
independent set, we really only need (m + n − 1) dual variables. We
could for example omit the first primal equality constraint and omit
u_1. Instead we retain both, to preserve the symmetry of the problems,
but we have only (m + n − 1) basic variables in a b.f.s. of the primal,
and consequently we always set one dual variable to zero.


10.2 

A method for solving transportation problems is developed by 
solving, in an intuitive fashion, a particular example. Specific parts 
of the method that emerges are discussed in more detail in the following 
sections. 
Let m = 3, n = 4; d_1 = 4, d_2 = 4, d_3 = 8; b_1 = 3, b_2 = 6, b_3 = 4, b_4 = 4,
and the cost coefficients c_ij be given by the cost matrix C,


13.43 
G=at.4.3-3-4 
245 4 (1) 


In this example Σ_i d_i < Σ_j b_j, so we introduce a fourth source D_4
containing 1 unit of the commodity, d_4 = 1, c_4j = 0, j = 1, 2, 3, 4,
and from now on m = 4, n = 4.

The values x_ij for any chosen solution x themselves constitute an
m × n matrix X. The sum of the rows of X must be b and the
sum of the columns must be d. This leads to a simple method for
finding an initial b.f.s., called the northwest corner method.


(2) 


Starting with $x_{11}$, the northwest corner element of X, we put
$x_{11} = \min(b_1, d_1)$, which is $b_1 = 3$ in this case. This means that
all other elements of the first column of X must be zero, so this
column can be removed, and the remaining row sum of the first row
of the remaining part of X is 4 − 3 = 1. This principle is now repeated
with the remaining parts of X, b and d. Thus $x_{12} = \min(6, 1) = 1$,
$x_{22} = \min(5, 4) = 4$ and so on. Each step determines the remaining
elements of one row or one column of X, except the very last choice
$x_{mn}$, which completes the m-th row and the n-th column of X, so
that in general (m + n − 1) elements of X, i.e. of x, will be assigned
a non-zero value.
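The northwest corner method can be sketched in a few lines. This is an illustrative implementation only; the function name and the 0-based indexing are our own, and on a tie between the remaining row sum and the remaining column sum the sketch removes the row, which is one of the two choices the text allows.

```python
def northwest_corner(d, b):
    """d: row sums (sources), b: column sums (destinations), with equal totals.
    Returns the basic cells as (i, j, value) triples, 0-based; there are
    m + n - 1 of them, some possibly with value zero."""
    d, b = list(d), list(b)
    if sum(d) != sum(b):
        raise ValueError("introduce a fictitious source or destination first")
    m, n = len(d), len(b)
    i = j = 0
    basic = []
    while True:
        x = min(d[i], b[j])
        basic.append((i, j, x))
        d[i] -= x
        b[j] -= x
        if i == m - 1 and j == n - 1:
            break
        # Row exhausted: move down.  Otherwise the column is exhausted: move right.
        # On a tie the row is removed, so the next basic variable is zero.
        if d[i] == 0 and i < m - 1:
            i += 1
        else:
            j += 1
    return basic

# The example of this section, after adding the fictitious fourth source.
for cell in northwest_corner([4, 4, 8, 1], [3, 6, 4, 4]):
    print(cell)
```

For the example this gives the seven basic variables $x_{11} = 3$, $x_{12} = 1$, $x_{22} = 4$, $x_{32} = 1$, $x_{33} = 4$, $x_{34} = 3$, $x_{44} = 1$.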

The same method for the case b = (4, 5, 4, 4) and
d = (4, 4, 8, 1) gives

(3)

and one of the zero $x_{ij}$ must be chosen as a basic variable. We will
adopt the convention that, in this situation, $x_{12}$ or $x_{21}$ is chosen; it
does not matter which, and the one with the smaller cost coefficient
would be a natural choice.

The value of this solution (2) (i.e. the cost of this particular
transportation scheme) is

\[ c'x = 2 \times 3 + 3 \times 1 + 3 \times 4 + 4 \times 1 + 5 \times 4 + 4 \times 3 = 57. \]

By the equilibrium theorem for canonical form, we know that if the
b.f.s. we have just obtained is optimum, then the dual constraints
corresponding to basic variables are satisfied as equalities (see exercises
5.8 and 6.8). Using this result to determine the vectors u and v gives
the seven equations:

\[ u_1 + v_1 = 2, \quad u_1 + v_2 = 3, \quad u_2 + v_2 = 3, \quad u_3 + v_2 = 4, \]
\[ u_3 + v_3 = 5, \quad u_3 + v_4 = 4, \quad u_4 + v_4 = 0. \tag{4} \]

Imposing the additional equation $u_1 = 0$ determines u and v; $v_1 = 2$,
$v_2 = 3$, $u_2 = 0$, $u_3 = 1$, $v_3 = 4$, $v_4 = 3$, $u_4 = -3$, and this computation
can conveniently be performed in a compact tableau similar to and
using (2).
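Because each equation involves one $u_i$ and one $v_j$, the system can be solved by simple propagation from $u_1 = 0$. The sketch below is illustrative only; the function name and the 0-based indexing are our own, and the costs are those of the basic cells of the northwest corner solution above.

```python
def dual_variables(basic_costs, m, n):
    """basic_costs maps each basic cell (i, j), 0-based, to its cost c_ij.
    Solves u_i + v_j = c_ij over the basic cells with u_0 = 0."""
    u = [None] * m
    v = [None] * n
    u[0] = 0
    remaining = dict(basic_costs)
    while remaining:
        progressed = False
        for (i, j), c in list(remaining.items()):
            if u[i] is not None and v[j] is None:
                v[j] = c - u[i]
            elif v[j] is not None and u[i] is None:
                u[i] = c - v[j]
            elif u[i] is None:
                continue  # neither value known yet; try again on a later pass
            del remaining[(i, j)]
            progressed = True
        if not progressed:
            raise ValueError("the basic cells do not determine u and v")
    return u, v

basic_costs = {(0, 0): 2, (0, 1): 3, (1, 1): 3, (2, 1): 4,
               (2, 2): 5, (2, 3): 4, (3, 3): 0}
u, v = dual_variables(basic_costs, 4, 4)
print(u, v)  # [0, 0, 1, -3] [2, 3, 4, 3]
```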


(5) 


Suppose we now evaluated $u_i + v_j$ for $x_{ij}$ non-basic and found that
$u_i + v_j \le c_{ij}$, i = 1, 2, ..., m, j = 1, 2, ..., n. Then u and v would satisfy
the dual constraints (4) or (5) of section 10.1, and

\[ (u', v')\begin{pmatrix} d \\ b \end{pmatrix} - c'x = (u', v')Ax - c'x = ((u', v')A - c')x. \]

But for $x_{ij} \ne 0$, $((u', v')A - c')_{ij} = u_i + v_j - c_{ij} = 0$, so that the two
objective values are equal and x and (u', v') are optimum solutions for
the primal and dual l.p.p.s respectively, by the duality theorem.

If u and v do not satisfy all the dual constraints we attempt to
find an improved b.f.s. of the primal by asking what would happen
with the corresponding b.f.s. in the simplex method. The essential
information we need is the vector of e.c.c.s $c^{e}$, which we obtain
by adding to c multiples of rows of A so that $c^e_{ij}$ is zero if $x_{ij}$ is
basic. In the present situation, we do not have an equivalent set
of primal constraints

\[ A'x = \begin{pmatrix} d' \\ b' \end{pmatrix} \]

in which the columns of A' corresponding to basic variables $x_{ij}$ are
columns of the unit matrix, so the appropriate row multipliers are
not just $-c_{ij}$. However the vector $c^{e}$ given by

\[ c^{e\,\prime} = c' - (u', v')A \]

satisfies the conditions $c^e_{ij} = 0$ if $x_{ij}$ is basic.
So, for (i, j) such that $x_{ij}$ is non-basic, we evaluate
$((u', v')A)_{ij} = u_i + v_j$, and these values can be put in the empty cells
of the tableau (5). Notice that these $u_i + v_j$ are just the $w_{ij}$ of section
5.4 so it is natural to call them $w_{ij}$. The current situation can be
described unambiguously by a single tableau if each cell contains

$c_{ij}$ and $x_{ij}$ if $x_{ij}$ is basic, or $c_{ij}$ and $w_{ij}$ if $x_{ij}$ is non-basic.

(6)
In the simplex method we introduce a positive amount θ of the non-basic
variable $x_{ij}$ corresponding to the negative e.c.c. of largest magnitude,
where θ is given the largest possible value consistent with maintaining
feasibility while changing the other basic variables to still satisfy the
equality constraints.

In the present situation the negative e.c.c.s are $c^e_{23}$, $c^e_{24}$, $c^e_{31}$ and
$c^e_{43}$. The largest in magnitude of these is $c^e_{23}$ (or $c^e_{24}$), so we put
$x_{23} = \theta$. To preserve the row and column sums we have successively
to replace $x_{22}$ by $x_{22} - \theta$, $x_{32}$ by $x_{32} + \theta$, and $x_{33}$ by $x_{33} - \theta$. From
these replacement values we see that the maximum value for θ is
4, which leads to the following b.f.s.


(7) 


As there are only six positive basic variables, we retain one of $x_{22}$
and $x_{33}$ as a basic variable with value zero, and since $x_{22}$ has the
smaller cost coefficient this is the natural choice.
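The θ-circuit can be found mechanically: add the entering cell to the basic cells, repeatedly discard any cell that is alone in its row or its column, and what remains is the circuit. The sketch below is an illustration under our own naming and 0-based indexing, not a prescribed procedure; it assumes the basic cells form a valid basis, so that exactly one circuit exists.

```python
def theta_circuit(basic, entering):
    """basic: dict {(i, j): x_ij} of the current basic cells (0-based).
    entering: the non-basic cell (s, t) that is to become basic.
    Returns (circuit, theta); the circuit starts at the entering cell, and
    cells at odd positions are the ones reduced by theta."""
    cells = set(basic) | {entering}
    changed = True
    while changed:  # strip cells that cannot lie on a closed path
        changed = False
        for cell in list(cells):
            if cell not in cells:
                continue
            in_row = sum(1 for c in cells if c[0] == cell[0])
            in_col = sum(1 for c in cells if c[1] == cell[1])
            if in_row == 1 or in_col == 1:
                cells.discard(cell)
                changed = True
    circuit = [entering]
    axis = 0  # 0: step along the current row, 1: step along the current column
    while True:
        cur = circuit[-1]
        nxt = next(c for c in cells if c[axis] == cur[axis] and c != cur)
        if nxt == entering:
            break
        circuit.append(nxt)
        axis = 1 - axis
    theta = min(basic[c] for c in circuit[1::2])
    return circuit, theta

basic = {(0, 0): 3, (0, 1): 1, (1, 1): 4, (2, 1): 1,
         (2, 2): 4, (2, 3): 3, (3, 3): 1}
circuit, theta = theta_circuit(basic, (1, 2))
print(circuit, theta)  # [(1, 2), (1, 1), (2, 1), (2, 2)] 4
```

In 1-based terms the circuit is $x_{23}, x_{22}, x_{32}, x_{33}$, with $x_{22}$ and $x_{33}$ both reduced to zero when θ = 4, which is why one of them is retained as a basic variable with value zero.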

The whole procedure can now be repeated with a single compact
tableau as (6), which is constructed by inserting in order

$c_{ij}$, i = 1, 2, ..., m, j = 1, 2, ..., n,

$x_{ij}$ for basic variables,

$u_i$, $v_j$, i = 1, 2, ..., m, j = 1, 2, ..., n,

$w_{ij}$ for $x_{ij}$ non-basic,

✓ where $c^e_{ij} = c_{ij} - w_{ij} \ge 0$,

±θ starting in the (s, t) cell where $\min c^e_{ij} = c^e_{st}$.

Before doing so, we check the value of the b.f.s. just obtained. This
is 6 + 3 + 8 + 20 + 12 = 49 = 57 + θ × min $c^e_{ij}$.
Value = 6 + 3 + 8 + 20 + 12 = 49,
min $c^e_{ij} = c^e_{24} = -2$,
θ = 0,
$c^e_{24}\theta = 0$. (8)


Value = 6 + 3 + 8 + 20 + 12 = 49,
min $c^e_{ij} = c^e_{31} = c^e_{43} = -1$, choosing
$c^e_{43}$ because $c_{43} < c_{31}$ leads to
θ = 1,
$c^e_{43}\theta = -1$. (9)


Value = 6 + 3 + 6 + 1 + 20 + 12 = 48,
min $c^e_{ij} = c^e_{31} = -1$,
θ = 3,
$c^e_{31}\theta = -3$. (10)


Value = 12 + 6 + 1 + 6 + 8 + 12 = 45,
all $c^e_{ij} \ge 0$,
hence current b.f.s. is optimum. (11)


The dual variables u, v satisfy the dual constraints $u_i + v_j \le c_{ij}$,
i = 1, 2, ..., m, j = 1, 2, ..., n, and, as it should be, the value of the
dual solution u'd + v'b is

\[ -8 + 8 - 4 + 3 + 18 + 16 + 12 = 45. \]


In this solution it is the third destination which receives less than 
its stated requirement. 
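The equality of the primal and dual values at the optimum can be checked directly. The tableau (11) is not reproduced here, so the optimum basic variables and the dual variables below are reconstructed from the iterations described above with $u_1 = 0$; they should be read as an assumption rather than as the printed tableau.

```python
d = [4, 4, 8, 1]   # source amounts, including the fictitious fourth source
b = [3, 6, 4, 4]   # destination requirements

# Reconstructed optimum: (i, j) -> (x_ij, c_ij), using 1-based indices.
basic = {
    (1, 2): (4, 3), (2, 3): (3, 2), (2, 4): (1, 1), (3, 1): (3, 2),
    (3, 2): (2, 4), (3, 4): (3, 4), (4, 3): (1, 0),
}
u = [0, -2, 1, -4]
v = [1, 3, 4, 3]

primal = sum(x * c for x, c in basic.values())
dual = sum(ui * di for ui, di in zip(u, d)) + sum(vj * bj for vj, bj in zip(v, b))
print(primal, dual)  # 45 45

# The dual constraints hold as equalities on every basic cell.
assert all(u[i - 1] + v[j - 1] == c for (i, j), (_, c) in basic.items())
```

The unit shipped from the fictitious source appears in column 3, in agreement with the remark that the third destination receives less than its stated requirement.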


10.3 

Since the method which evolved in the previous section was just 
the simplex method we know it will reach the optimum solution in 
a finite number of steps. However, it is essential in the simplex method 
that the columns of A corresponding to the basic variables are 
independent, so we must check that this is the case initially and 
that the θ-circuit preserves this property. Three other pertinent aspects
are discussed in the following section. These are:

(i) whether the northwest corner method provides a good initial b.f.s.,
(ii) whether $x_{\mathrm{opt}}$ necessarily has integer valued elements when b and
d have integer valued elements, and
(iii) the question of cycling when b.f.s.s are degenerate.

To show that the columns of A corresponding to the basic variables
determined by the northwest corner solution are independent we just
emphasise the convention of section 10.2, that when the remaining
row sum and the remaining column sum are the same, for example
$b_1 = d_1$, or $b_1 > d_1$ and $b_1 - d_1 = d_2$, we will reduce the remaining
part of X by removing its first row or column (it does not matter
which, and in these instances the next basic variable chosen will
have value zero). Thus in every case the number of basic variables
determined is (m + n − 1), there is (at least) one row or column of
X with only one basic variable and, crucially, if we remove this row
or column, then the remaining part of X has the same property (ER).
Denoting the corresponding columns of A by an (m + n) × (m + n − 1)
matrix B, the northwest corner method has produced a solution of

\[ Bx = \begin{pmatrix} d \\ b \end{pmatrix}, \tag{1} \]


where x is the (m + n − 1)-vector of basic variables. Since we have
a solution, $\begin{pmatrix} d \\ b \end{pmatrix}$ is in the column space of B and so either B has
full rank, i.e. r(B) = (m + n − 1), or the solution x is not unique.
But x is unique, because, returning to the equivalent situation in
which row sums and column sums of X are equal to elements of
d and b, as there must be at least one row or column of X with
only one basic $x_{ij}$, the value of this $x_{ij}$ is uniquely determined. If
we now remove this row or column of X exactly the same argument
holds for the remaining (m + n − 1) rows and columns of X and
(m + n − 2) basic variables. Hence, inductively, all the $x_{ij}$ are uniquely
determined, so that $Bx = \begin{pmatrix} d \\ b \end{pmatrix}$ has a unique solution and B has full
rank. We could regard the northwest corner method as identifying
a set of (m + n − 1) basic variables, whose values x are then determined
by (1). From the way in which the method chooses the basic variables,
we can see that each can have only one value if they are to satisfy
(1).

To show that successive b.f.s.s do correspond to independent
columns of A we examine the procedure of the θ-circuit. This consists
of alternate steps along rows and columns of X and must involve
only rows and columns of X with at least two basic variables, the
only possible exceptions being the first and last steps from the new
variable, $x_{st}$ say, that has just been selected. The complete circuit
defines a closed path and identifies a number of the current basic
variables together with $x_{st}$. The columns of A corresponding to the
variables defining any such path are linearly dependent (see exercise
10.7). When the value of θ has been chosen, and $x_{st}$ is given this
value, one of the basic variables on the circuit, $x_{s't'}$ say, has value
zero and all of them have a unique value. This follows because if
d' denotes d with $d_s$ replaced by $d_s - \theta$, and b' denotes b with
$b_t$ replaced by $b_t - \theta$, then $Bx = \begin{pmatrix} d' \\ b' \end{pmatrix}$ still has a unique solution
(with $x_{s't'} = 0$). Thus if B' denotes B with the column corresponding
to $x_{s't'}$ replaced by the column of A corresponding to $x_{st}$, and x'
denotes x with $x_{s't'}$ replaced by $x_{st}$, then $B'x' = \begin{pmatrix} d \\ b \end{pmatrix}$ has a unique
solution (with $x_{st} = \theta$) and so the columns of B' are independent.
(Alternatively, see exercise 10.7.)
We observe at this point that the equations which determine u
and v are

\[ (B')'\begin{pmatrix} u \\ v \end{pmatrix} = c_B, \]

where $c_B$ is an (m + n − 1)-vector of cost coefficients, and since the
rows of (B')' are independent there is always a solution of these equations
for u and v (see exercise 10.7).

For small "academic" examples a suitable circuit of basic variables
can be found by inspection. In practice, for larger problems, a 
systematic search procedure is needed and one way to organise this 
is suggested by the technique used in the following two chapters 
where network flows are discussed. 

The method developed in section 10.2 for solving transportation 
problems is sometimes called the stepping-stone method. 


10.4 

(i) The northwest corner solution is not necessarily the best initial
b.f.s. to use, and any other b.f.s. with a lower cost would be
preferable, although it would not necessarily result in fewer stages
to obtain the optimum solution. One alternative which usually
gives an improved initial b.f.s. (but sometimes a worse one!)
is the matrix minimum method. Here, instead of starting with
the northwest corner element of C, we start with $c_{st}$ where

\[ c_{st} = \min_{\substack{i = 1, 2, \ldots, m \\ j = 1, 2, \ldots, n}} c_{ij}. \]

This determines $x_{st}$ and effectively reduces the problem by one
row or column.

A compromise between this and the northwest corner method,
to save repeatedly searching the whole matrix, is to choose the
variable $x_{ij}$ corresponding to the minimum cost coefficient in each
row (or column) in turn instead of the northwest corner coefficient.
For the example of section 10.2 two of these approaches yield
the initial b.f.s.s below, where the integers in the bottom left-hand
corners indicate the order in which the basic variables are
determined.

matrix minimum method: cost = 58

successive row minimum method: cost = 48

(ii) If b and d have integer elements then $x_{\mathrm{opt}}$ necessarily has integer
elements, which is an important result for practical purposes (see
also section 10.5). This follows directly from the method developed
in section 10.2, which nowhere involves division. Notice however,
that the method does not require that b and d have integer elements.
The crucial aspect of transportation problems is that although
feasible solutions x need not have integer elements when b and
d have integer elements (e.g. $x_{ij} = b_j d_i / \sum_j b_j$), when this is the
case any b.f.s. must have integer elements (see exercise 10.6).

(iii) The possibility of cycling (see section 4.7) would appear to be
more serious for transportation problems, partly because b.f.s.s
often have several basic variables with value zero, and partly
because when all the numbers involved are integers the arithmetic
operations will be performed exactly. Nevertheless, transportation
type problems which cycle are not expected to occur in practice
so that the perturbation technique which follows is of mainly
academic interest.

As with the basic simplex method, to prevent cycling it is
sufficient to prevent degeneracy, and to prevent degeneracy it
is sufficient to prevent ties between remaining column sums and
remaining row sums. This can be done by replacing $d_i$ by $d_i + \varepsilon$,
i = 1, 2, ..., m, and $b_n$ by $b_n + m\varepsilon$, for some positive but sufficiently
small ε. As in section 4.7 a specific value need not be chosen
for ε, and we just use the principle to decide which variables
$x_{ij}$ are basic with value zero. A different approach to transportation
problems which uses graph theory may be found in {10}.


10.5 Assignment Problems 

The transportation problem can be regarded as a problem in assigning 
the amounts of the commodity at each source to go to each destination, 
with a specific penalty, the cost, for each source and destination 
pair. The corresponding situation in which there is a benefit instead 
of a penalty clearly leads to an l.p.p. in which A has the same structure
and a typical situation is that of personnel assignment. We shall 
distinguish three assignment problems: 
(i) the simple assignment problem, 
(ii) the optimum assignment problem, 
(iii) the categorised optimum assignment problem. 


The first two of these, although they are of transportation type, 
are even more specialised and may be solved by the special methods 
developed in chapter 12. 

In the categorised optimum assignment problem, we may consider
the situation of n categories of job with $b_1, b_2, \ldots, b_n$ vacancies
respectively, and m categories of applicant with $d_1, d_2, \ldots, d_m$ persons
respectively. For each category of applicant and each category of
job, there is a rating which gives a numerical measure of the applicants'
suitability for the jobs, and the problem is to decide how many persons
from each category to assign to the various jobs so that the sum
of the assignment ratings is maximised.
We may list the ratings $r_{ij}$ in a rating matrix R. Then with $x_{ij}$ as
the number of applicants in the i-th category assigned to the j-th
job, the l.p.p. may be written

\[ \text{maximise } \sum_{i=1}^{m} \sum_{j=1}^{n} r_{ij} x_{ij} \quad \text{subject to} \tag{1} \]
\[ \sum_{j=1}^{n} x_{ij} = d_i, \quad \sum_{i=1}^{m} x_{ij} = b_j, \quad x_{ij} \ge 0, \quad i = 1, 2, \ldots, m, \; j = 1, 2, \ldots, n. \]

There would naturally be the additional requirement that $x_{ij}$ is an
integer, but we know this will be the case for the optimum solution.
This is exactly the form of the transportation problem. If
$\sum_j b_j \ne \sum_i d_i$ we introduce a fictitious category of person or job,
and if we put $c_{ij} = -r_{ij}$, i = 1, 2, ..., m, j = 1, 2, ..., n, then the l.p.p.
is exactly that of (2) or (3) of section 10.1.


Example 

Three categories of applicant with 5, 8, 4 persons respectively, 
apply for five types of job with 4, 2, 1, 7, 3 vacancies respectively. 
The rating matrix is 


2. 21:5 Sig 
om hi 2t 3) ft 
soe 2} 2 


Find the assignment which maximises the sum of the assigned ratings. 
For this example we find the initial b.f.s. not by the northwest corner
method but by choosing minimum column elements. Thus $x_{11} = 4$,
$x_{32} = 2$, $x_{13} = 1$, $x_{14} = 0$, $x_{24} = 7$, $x_{35} = 2$, $x_{25} = 1$, giving an initial cost
of −59, that is a rating of 59.


Value = −(12 + 5 + 21 + 1 + 16 + 4),
i.e. rating = 59,
min $c^e_{ij} = c^e_{21} = c^e_{31} = -3$,
choosing $c^e_{31}$ because $c_{31} < c_{21}$
leads to θ = 2,
$c^e_{31}\theta = -6$.


Value = −(6 + 5 + 16 + 15 + 3 + 4 + 16),
i.e. rating = 65,
min $c^e_{ij} = c^e_{21} = -3$,
θ = 2,
$c^e_{21}\theta = -6$.


Value = −(5 + 32 + 2 + 9 + 3 + 4 + 16),
i.e. rating = 71,
min $c^e_{ij} = c^e_{23} = -2$,
θ = 1,
$c^e_{23}\theta = -2$.


Value = −(40 + 2 + 2 + 6 + 3 + 4 + 16),
i.e. rating = 73,
all $c^e_{ij} \ge 0$,
hence current b.f.s. is
optimum.


The dual variables u, v satisfy the constraints, and the value of the
dual solution is

40 + 16 − 24 − 24 − 7 − 56 − 18 = −73.

Thus the maximum possible overall rating is 73, and an assignment
that gives this rating is: 2 persons from group 2 and 2 from group
3 do job 1, 2 persons from group 3 do job 2, 1 person from group
2 does job 3, 5 persons from group 1 and 2 from group 2 do job
4, and 3 persons from group 2 do job 5.

Instead of converting ratings to costs and minimising, we could
maximise the sum of the assigned ratings directly, in which case
the optimality criterion would be $c^e_{ij} \le 0$, i = 1, 2, ..., m, j = 1, 2, ..., n:
it is a matter of personal opinion which is least confusing.


10.6 The Caterer’s Problem 

The transportation problem appears in a number of situations. The 
trans-shipment problem is one (see exercise 10.14), and the contract 
award problem (see exercise 1.4) is another. A particularly ingenious 
application involves a caterer who requires clean table-cloths for 
dinner-parties on successive days. The table-cloths can be purchased
for $c_1$, cleaned overnight for $c_2$ or cleaned over a period of p days
for $c_3$ (i.e. used on day j and ready again on day (j + p + 1)), where
$c_1 > c_2 > c_3$. Assuming that any number of table-cloths may be
purchased and any number cleaned by either laundry service on any
day (and assuming that they are unerringly soiled by the diners), 
how should the caterer arrange for the daily supply of clean table-cloths 
so that the cost of providing them is minimised? 

Suppose there are n dinner-parties, one on each of n successive
days, the j-th one requiring $b_j$ table-cloths, j = 1, 2, ..., n; these are
the destinations. The sources are the supplier, whom we assume has
$d_{n+1} = \sum_{j=1}^{n} b_j$ available for purchase, and the n baskets of soiled
table-cloths at the end of each party, the i-th one containing $d_i$, where
$d_i = b_i$, i = 1, 2, ..., n. It is convenient to introduce an aftermath
destination which requires $b_{n+1}$ table-cloths, where $b_{n+1} = d_{n+1}$. If
we denote by $x_{ij}$ the number of table-cloths used on the j-th day
from the i-th source, then the constraints are

\[ \sum_{i=1}^{n+1} x_{ij} = b_j \quad (= \text{total number used on j-th day}), \quad j = 1, 2, \ldots, n + 1, \]

\[ \sum_{j=1}^{n+1} x_{ij} = d_i \quad (= b_i = \text{total number used from i-th source on all days, plus the number going to the final destination from the i-th source}), \quad i = 1, 2, \ldots, n + 1, \]

and $x_{ij} \ge 0$, i, j = 1, 2, ..., n + 1.
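The cost coefficients are:

\[ c_{ij} = \begin{cases} 0, & j = n + 1, \; i = 1, 2, \ldots, n, \\ c_1, & i = n + 1, \; j = 1, 2, \ldots, n, \\ c_2, & i = 1, 2, \ldots, n - 1, \; j = i + 1, i + 2, \ldots, \min(i + p, n), \\ c_3, & i = 1, 2, \ldots, n - p - 1, \; j = i + p + 1, \ldots, n, \\ \infty, & i = 1, 2, \ldots, n, \; j = 1, 2, \ldots, i. \end{cases} \quad \text{(ER)} \]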
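The cost matrix of the caterer's problem can be assembled mechanically from n, p and the three prices. The sketch below is illustrative: the function name is our own, indices are 0-based, cells in which a basket would have to supply its own or an earlier day are given infinite cost, and the supplier-to-aftermath cell is given cost 0; these last two conventions are assumptions made to complete the matrix.

```python
import math

def caterer_costs(n, p, c1, c2, c3):
    """Return the (n+1) x (n+1) cost matrix.  Row i < n is the basket of
    soiled cloths after day i+1, row n is the supplier; column j < n is
    day j+1, column n is the aftermath destination."""
    cost = [[math.inf] * (n + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        cost[i][n] = 0                    # cloths left over at the end cost nothing
    for j in range(n):
        cost[n][j] = c1                   # bought new for day j+1
    for i in range(n):
        for j in range(i + 1, n):
            # ready within p days: only the overnight service is quick enough;
            # otherwise the slower, cheaper service can be used
            cost[i][j] = c2 if j - i <= p else c3
    return cost

# Data in the style of exercise 10.13: four parties, buy at 9, overnight at 4,
# slower service at 2 and ready the day after next, i.e. p = 1.
for row in caterer_costs(4, 1, 9, 4, 2):
    print(row)
```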
Exercises 10


1. Interpret the dual of the transportation problem in canonical primal
form, in terms of a haulage company who offer to buy the
commodity at the sources where it is manufactured and sell it
back to the manufacturer at the destinations.


2. Solve the transportation problem in which m = 3, n = 5,
d' = (4, 5, 6), b' = (2, 2, 3, 4, 4) and
a, ee ae oe 
Cusine B47 
44 2 sh 


3. Obtain a sufficient condition for the optimum solution of the
transportation problem to be unique (see exercise 3.6). Obtain
a different optimum transportation scheme for the examples of
exercise 10.2 and of section 10.2.


4. Prove that the matrix A of a transportation problem has rank
(m + n − 1).


5. For m = 3, n = 5 say, and supposing that the first row of the
matrix A of a transportation problem is removed, choose any
7 (= m + n − 1) independent columns and show that they can be
rearranged by row and column interchanges to form an upper
triangular matrix with unit diagonal. What is the implication of
this result? (The result holds in general.)


6. A companion exercise to 10.5 which requires a knowledge of
determinants: Prove that all minors of A (determinants of square
submatrices of A) have value −1, 0, or +1. Hence explain why
the inverse of any non-singular (m + n − 1) × (m + n − 1) submatrix
of A has only integer elements, and hence why, if $d_i$, $b_j$ are all
integers, then the optimum solution of a transportation problem
has only integer elements.


7. Let $z_1, z_2, \ldots, z_{2k}$ be the columns of a transportation matrix A
corresponding to the variables $x_{ij}$ in a θ-circuit. Show that the
vectors $z_1, z_2, \ldots, z_{2k}$ are linearly dependent and that
$\sum_{i=1}^{2k} \alpha_i z_i = 0$ with each $\alpha_i = \pm 1$. Explain why any (2k − 1) of the vectors
$z_1, z_2, \ldots, z_{2k}$ are linearly independent. Hence prove that at every
stage of the method of section 10.2 the columns of A corresponding
to basic variables are linearly independent.


8. Solve the example of section 10.2 starting with the initial b.f.s.
obtained by (i) the matrix minimum method, and (ii)
the successive column minimum method.
9. Show that during the solution of a transportation problem the
vectors u, v, x at every stage satisfy

\[ \sum_{i} \sum_{j} c_{ij} x_{ij} = \sum_{i} u_i d_i + \sum_{j} v_j b_j. \]

Explain whether, instead of putting $u_1 = 0$ at each stage, any
element of u or v can be given an arbitrary value.


10. Solve the categorised optimum assignment problem in which five
categories of person with 5, 6, 3, 1, 3 persons in each category
respectively, apply for three categories of job with 7, 6, 4 vacancies
respectively and the rating matrix is


a 2 
4 
] 
2 
3) 2 


wounds ~s 


Ny wD — 


11. A personnel officer, having solved a categorised assignment
problem, decides to revise the rating matrix R. The new ratings
$\bar r_{ij}$ are given by $\bar r_{ij} = \alpha r_{ij} + \beta_j$, for some constant α and constants
$\beta_1, \beta_2, \ldots$. Find an elegant way (!) of obtaining the revised
optimum solution.

12. Solve the categorised optimum assignment problem of section
10.5 starting with the initial b.f.s. given by

(i) the northwest corner method,
(ii) the matrix minimum method, and
(iii) the successive row minimum method.


13. Solve the caterer's problem in which there are four dinner parties
on successive days requiring respectively 20, 27, 38, 28 table-cloths
which cost 9 units to buy, 4 units to clean overnight or 2 units
to clean by the day after next.


14. One version of the trans-shipment problem is the transportation
problem in which there are intermediate junctions where loads
of the commodity can be divided and reassembled and which
have a maximum capacity. Ignoring any costs arising from the
redistribution, and assuming each part-route has a transportation
cost per unit of commodity, express the trans-shipment problem
as an l.p.p.


15. Prove that in the optimum solution x of a transportation type
problem at least one variable $x_{ij}$ is equal to $b_j$ or $d_i$.


CHAPTER 11. 


NETWORK FLOWS 


11.1 

The solution of a transportation problem can be thought of as defining
a flow of the commodity from the set of sources to the set of destinations 
along the routes connecting them. A time period was never mentioned 
in connection with transportation problems and so the resulting solution 
can refer to a single transportation task, an annual programme, or 
a weekly, or a daily one. Alternatively, all amounts of commodity 
can be interpreted as rates of flow, so that for example, b, = 2b, 
means that whatever quantity of the commodity is delivered to the 
second destination, twice that quantity is delivered to the first, and 
so on. If we imagine a more complicated network of routes connecting 
sources and destinations, with intermediate junctions, and instead 
of a unit cost for each part of each route there is a maximum capacity, 
then the problem of determining the maximum possible flow is clearly 
a l.p.p. (see exercise 11.1). Instead of developing a special version 
of the simplex method to solve such problems we develop an indepen- 
dent method. We restrict our attention to networks with a single 
source s and a single destination s’, which in this context is called 
a sink, but see exercise 11.5. The method, or algorithm, we develop 
can be used to solve network flow problems with integer or with 
arbitrary capacities but as we shall use it in chapter 12 for assignment 
problems we will only consider problems in which the capacities are 
integers. 

The points of a network, the source(s), sink(s) and intermediate
points, are called nodes and the connecting routes are called edges.
The nodes are denoted by $x_1, x_2, \ldots, x_n$, an edge by $(x_i, x_j)$ and the
whole set of nodes by N. A capacity function k assigns to each
edge of N a non-negative integer $k(x_i, x_j)$ which is the maximum flow
from $x_i$ to $x_j$ that the edge $(x_i, x_j)$ can support. Capacities may be
symmetric ($k(x_i, x_j) = k(x_j, x_i)$) or unsymmetric ($k(x_i, x_j) \ne k(x_j, x_i)$)
and $k(x_i, x_i) = 0$. A capacitated network (N, k) is a network N together
with the associated capacity function k.
A flow in a capacitated network (N, k) is a function which assigns
to each edge $(x_i, x_j)$ a number $f(x_i, x_j)$ and which satisfies

\[ f(x_i, x_j) = -f(x_j, x_i), \tag{1} \]

\[ \text{and} \quad f(x_i, x_j) \le k(x_i, x_j). \tag{2} \]

A flow function is essentially just a list of flows, and will be integer
valued as far as we are concerned. Notice that (1) implies that
$f(x_i, x_i) = 0$ and introduces the convention of nett flows. If there is a flow
α from $x_i$ to $x_j$ and β from $x_j$ to $x_i$ this is the same as a flow
α − β from $x_i$ to $x_j$.

For the two functions k and f defined on a network N we shall
use the notation k(A, B) and f(A, B) to denote

\[ \sum_{\substack{x_i \in A \\ x_j \in B}} k(x_i, x_j) \quad \text{and} \quad \sum_{\substack{x_i \in A \\ x_j \in B}} f(x_i, x_j), \]

where A and B are any subsets of N.

The properties (1) and (2) imply that

\[ f(A, A) = 0 \quad \text{and} \quad f(A, B) \le k(A, B). \]

Also, for any disjoint subsets A and B of N, and any subset C of N,

\[ f(A \cup B, C) = f(A, C) + f(B, C), \]
\[ \text{and} \quad f(C, A \cup B) = f(C, A) + f(C, B), \tag{3} \]

and the same is true for k.
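These set functions are easy to experiment with. The sketch below uses a small hypothetical network (nodes s, a, b, t with made-up symmetric capacities, not the example of this chapter) and a flow that uses every edge to capacity; the function name `set_sum` is our own.

```python
def set_sum(g, A, B):
    """g(A, B): the sum of g(x, y) over x in A and y in B.
    Pairs with no edge between them contribute 0."""
    return sum(g.get((x, y), 0) for x in A for y in B)

# Hypothetical network: capacity of each edge, listed in one direction only.
forward = {("s", "a"): 3, ("s", "b"): 2, ("a", "b"): 1, ("a", "t"): 2, ("b", "t"): 3}

k, f = {}, {}
for (x, y), c in forward.items():
    k[(x, y)] = k[(y, x)] = c        # symmetric capacities
    f[(x, y)], f[(y, x)] = c, -c     # skew-symmetric flow, property (1)

A, B = {"s", "a"}, {"b", "t"}
print(set_sum(f, A, A))                      # 0
print(set_sum(f, A, B), set_sum(k, A, B))    # 5 5
# Additivity (3) over the disjoint sets {s} and {a}:
print(set_sum(f, {"s"}, B) + set_sum(f, {"a"}, B))  # 5
```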

We formally define a source s and a sink s' for a flow f in a
network by saying a node s is a source for f if

\[ f(s, N) > 0 \quad \text{and} \quad f(s, x_i) \ge 0, \quad x_i \in N; \]

and a node s' is a sink for f if

\[ f(N, s') > 0 \quad \text{and} \quad f(x_i, s') \ge 0, \quad x_i \in N. \]

The second condition in both cases is to avoid any possible complications
with unproductive circular flows, for example from s to $x_1$ to
$x_2$ to $x_3$ to s, and from now on N will be used to denote all the
nodes $x_1, x_2, \ldots, x_n$ together with s and s'.

We can now define the problem as follows: given a capacitated 
network (N,k) with a single source s and a single sink s’, find a 
flow f whose value f(s, N) is a maximum.

For any (finite) network a maximum flow exists, and its value is 
at most k(s,N) (ER). 


11.2 
To develop a method for finding a maximum flow we need another 
concept, that of a cut in a network. 
A cut (S, S') in a capacitated network (N, k) with a single source
s and a single sink s' is a division of the nodes of N into two disjoint
subsets S and S' which satisfy $S \cup S' = N$, $s \in S$, $s' \in S'$.

The capacity of a cut is defined as k(S, S'), and a cut whose capacity has
the minimum possible value is called a minimum cut.

The connection between flows and cuts begins with the following
observation: if (S, S') is any cut in a capacitated network (N, k) with
a single source s and a single sink s', and f is any flow, then the
value of the flow f(s, N) is at most the capacity of the cut k(S, S').
If f(s, N) = k(S, S') then f is a maximum flow and (S, S') is a minimum
cut.

This result is easily established using (1), (2) and (3) of section 11.1.
As $f(x_i, N) = 0$ for $x_i \ne s, s'$,

\[ \begin{aligned} f(s, N) &= f(s, N) + f(X, N), \quad \text{where } x_i \in X \text{ if } x_i \in S \text{ and } x_i \ne s, \\ &= f(S, N) = f(S, S \cup S') \\ &= f(S, S) + f(S, S') = f(S, S') \le k(S, S'). \end{aligned} \]

If a particular flow $f_0$ and a particular cut $(S_0, S_0')$ satisfy $f_0(s, N)
= k(S_0, S_0')$ then $f(s, N) \le k(S_0, S_0')$ for any flow f, and hence $f_0$ is
a maximum flow, and similarly $(S_0, S_0')$ is a minimum cut.

The correspondence between cuts and flows and primal and dual 

l.p.p.s is already apparent and is emphasised by the next theorem. 


Theorem 13. The Maximum Flow-Minimum Cut Theorem 

For any capacitated network with a single source and a single sink
the value of a maximum flow is equal to the value of a minimum
cut.


Let f be a maximum flow. We say an edge $(x_i, x_j)$ is saturated
by f if

\[ f(x_i, x_j) = k(x_i, x_j). \]

A path is a sequence of edges of N connecting distinct nodes of
N, and an unsaturated path is a path all of whose edges are unsaturated.

Thus a path from s to $x_i$ can be denoted by

\[ P = (s, x_{i_1}, x_{i_2}, \ldots, x_i). \]

The edges of P are $(s, x_{i_1}), (x_{i_1}, x_{i_2}), \ldots, (x_{i_r}, x_i)$, and if P is unsaturated
then for any edge $(x_i, x_j)$ of P

\[ f(x_i, x_j) < k(x_i, x_j). \]
Now we define sets S and S' by saying $s \in S$, and $x_i \in S$ if $x_i \in N$
and there is an unsaturated path from s to $x_i$; $x_i \in S'$ if $x_i \in N$ and
$x_i \notin S$.

To show that (S, S') is a cut we need to show that $s' \notin S$. So suppose
that $s' \in S$; then there is an unsaturated path P from s to s'. Let

\[ \delta = \min_{(x_i, x_j) \in P} \bigl( k(x_i, x_j) - f(x_i, x_j) \bigr) \]

and define a flow $\bar f$ by

\[ \bar f(x_i, x_j) = f(x_i, x_j) \quad \text{for } (x_i, x_j) \notin P \quad \text{and} \]
\[ \bar f(x_i, x_j) = f(x_i, x_j) + \delta \quad \text{for } (x_i, x_j) \in P, \]

with the flows on the reversed edges of P reduced by δ so that (1) still holds.
From the definition of P, δ > 0 and the flow $\bar f$ satisfies

\[ \bar f(x_i, x_j) \le k(x_i, x_j) \quad \text{for all edges of N}, \]

but the value of the flow $\bar f$ is

\[ \bar f(s, N) = f(s, N) + \delta, \]

which contradicts the flow f being maximum. Thus $s' \in S'$ and (S, S')
is a cut.

Now we already know that

\[ f(s, N) = f(S, S') \le k(S, S'). \]

Hence, if f(s, N) < k(S, S') then $f(x_i, x_j) < k(x_i, x_j)$ for some edge $(x_i, x_j)$
with $x_i \in S$ and $x_j \in S'$, and so the unsaturated path from s to $x_i$
can be extended to $x_j$, which contradicts the definition of S and S'.
Thus f(s, N) = k(S, S').

Theorem 13 is clearly the counterpart for networks of the duality 
theorem for /.p.p.s, and like the duality theorem, it gives a means 
of testing whether a flow is maximum. It also suggests a method 
of constructing a maximum flow by listing, for any current flow, 
the set of nodes that can be reached from s by unsaturated paths. 
If this set contains s’, we can improve the current flow and repeat; 
if this set does not contain s’ we can verify that the flow is a maximum 
flow using the cut defined by the set. A systematic way of implementing
this method is described by an example.
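The method suggested by Theorem 13 can be sketched as follows: search from s along edges with positive remaining capacity k − f, improve the flow along any unsaturated path that reaches s', and stop when s' can no longer be reached. This is a minimal illustration using a breadth-first search, which is our own choice of search order and is applied here to a small hypothetical network (nodes s, a, b, t), not to the network of the example that follows.

```python
from collections import deque

def max_flow(k, nodes, s, t):
    """k: dict {(x, y): capacity}.  Returns (value, f, S), where f is a
    skew-symmetric flow and S is the set of nodes reachable from s by
    unsaturated paths when the search finally fails to reach t."""
    f = {(x, y): 0 for x in nodes for y in nodes}

    def remaining(x, y):
        return k.get((x, y), 0) - f[(x, y)]

    while True:
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:
            x = queue.popleft()
            for y in nodes:
                if y not in parent and remaining(x, y) > 0:
                    parent[y] = x
                    queue.append(y)
        if t not in parent:
            value = sum(f[(s, y)] for y in nodes)
            return value, f, set(parent)
        path = []                      # trace the unsaturated path back from t
        y = t
        while parent[y] is not None:
            path.append((parent[y], y))
            y = parent[y]
        delta = min(remaining(x, y) for x, y in path)
        for x, y in path:
            f[(x, y)] += delta
            f[(y, x)] -= delta         # keep property (1) of section 11.1

# Hypothetical network with symmetric capacities.
forward = {("s", "a"): 3, ("s", "b"): 2, ("a", "b"): 1, ("a", "t"): 2, ("b", "t"): 3}
k = {}
for (x, y), c in forward.items():
    k[(x, y)] = k[(y, x)] = c
nodes = ["s", "a", "b", "t"]

value, f, S = max_flow(k, nodes, "s", "t")
cut = sum(k.get((x, y), 0) for x in S for y in nodes if y not in S)
print(value, sorted(S), cut)  # 5 ['s'] 5
```

The final set S defines a cut whose capacity equals the value of the flow, which is the verification that the flow is a maximum.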


11.3 
Find a maximum flow in the network: 
[Figure: the network, with the capacity of each edge marked.]


where the edges have the symmetric capacities indicated. 

The network can be described by its capacity matrix K, where 
k, denotes the capacity of the edge (x,,x,) and where entries are 
made only of elements which have a corresponding edge in the network. 
The initial capacity matrix, referring to the network with no flow 
defined, we denote by K,. 


Ss 8 1 

rr © 4 2 

2 4 3 2 
Ko= 431 4 3 ] 

4 i 4 

5 2 | 3 5 

6 ] 3 

s’ 4 3 


We can see by inspection that a flow J, can be imposed consisting 
of 


2 units from s to x, to x, to s’, 
1 unit from s to x, to x, to s’, 
| unit from s to x, to x, tos’, 


4 units from s to x, to x, to x, to s’. 
This flow can be described by a flow matrix $F_1$, where $(F_1)_{ij}$ denotes
the flow from $x_i$ to $x_j$.


Ss I | 
1 4 2 
2 5 

= 13 | 
4 2 
5 -5 5 
6 -1 1 
s’ 2 -5 -l 


Since the edges of N have symmetric capacities, K, is symmetric. 
The matrix F, is skew-symmetric. The zero elements of F, and the 
elements for which there is no corresponding edge have been omitted. 

The flow of | unit from s to x, means, for example, that the edge 
(s,x,) now has capacity 4 — | = 3 units and the edge (x,,s) now 
has capacity 4 − (−1) = 5 units. Overall, the capacity of N with
the flow $f_1$ is given by $K_1 = K(f_1) = K_0 - F_1$.

! 


Ss 2 Go 3.3 

1 14 0 0 z 

2 8 3 2 
K,= 3 3 0 

4 l 2 

5 2 8 | 2. oe 

6 ee oe 3 2 

x 6.1 10, ,4 


Using K_1 we now search for an unsaturated path from s to s’. The
edge from x_i to x_j is unsaturated if (K_1)_ij > 0, so searching the s-row
of K_1 we find unsaturated edges (s,x,), (s,x,), which we can denote
by
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x, 
‘=> 
Xe 


Now searching the x,-row of K, we find that the edges (x,,s) and 
(x,,x,) are unsaturated, but as s has appeared before in this search, 
the edge (x,,s) can be ignored. Thus we have 

x,— {x,, 
and the same procedure for x, gives 

x, {x,, 
where the edge (x,,s) has been ignored because s has already appeared. 

Combining both stages gives 


x, (x, 
s> (1) 

x, > {x,, 

and continuing with the x, and x,-rows of K, gives 
X;—> 
X6 

2 
ones. (2) 


where X denotes that no further progress can be made since all 
unsaturated edges from x,, (x,,5), (%,,X,), (%,,x;), (%,,,), lead to nodes 
which already appear in the tree (1) and (2). 

The whole tree so far is 


x4 
x, 7 (x, 
X6 
ot (3) 
X,— (x, > X 


The x_4-row of K_1 gives
x_4 → {s’,
so we have an unsaturated path P_1,
P_1 = (s, x_1, x_5, x_4, s’).
The search procedure finds systematically an unsaturated path from
s to s’ if there is one. It does not find all unsaturated paths nor
the unsaturated path with greatest capacity. We would find a different
path if we considered x, in the tree (1) before considering x,, or
if we considered x, in the tree (2) before considering x,.
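
The search just described can be sketched in code. This is a minimal illustration under stated assumptions, not a definitive implementation: the current capacity matrix is assumed to be held as a list of lists with the nodes numbered 0, 1, 2, ..., and nodes are examined level by level (breadth-first), which is one reasonable reading of the tree construction above. The name find_unsaturated_path and the small four-node matrix are hypothetical and are not the network of this section.

```python
from collections import deque


def find_unsaturated_path(K, source, sink):
    """Return one unsaturated path from source to sink, or None.

    An edge (i, j) is unsaturated when K[i][j] > 0.  Nodes that already
    appear in the tree are ignored, as in the text.
    """
    parent = {source: None}               # the search tree built so far
    queue = deque([source])
    while queue:
        i = queue.popleft()
        for j, capacity in enumerate(K[i]):   # search the i-row of K
            if capacity > 0 and j not in parent:
                parent[j] = i
                if j == sink:
                    path = [j]
                    while parent[path[-1]] is not None:
                        path.append(parent[path[-1]])
                    return path[::-1]
                queue.append(j)
    return None


# Hypothetical network: node 0 is s, nodes 1 and 2 are intermediate, node 3 is s'.
K = [[0, 2, 1, 0],
     [0, 0, 1, 1],
     [0, 0, 0, 2],
     [0, 0, 0, 0]]
path = find_unsaturated_path(K, 0, 3)
# Minimum capacity of the edges on the path found.
bottleneck = min(K[a][b] for a, b in zip(path, path[1:]))
```

As in the text, the path returned is simply the first one the search meets; it is not necessarily the unsaturated path of greatest capacity.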
The minimum capacity of edges on the path P, is 
min {(K,).,, (K,),5» (K,)s4, (K, gs} = min {2,2,4,2) = 1. 
Thus the flow δf_1, described by δF_1, can be added to f_1.

[Matrix δF_1]

We can now describe the current capacity of N with f_2 by
K_2 = K_1 − δF_1 = K_0 − F_2.

[Matrix K_2]
The search tree for K_2 is
x, (x, {x 8’ 
sS-> 
xX, — {x,— {X 
and a further flow of 1 unit

from s to x, to x, to x, to s’

can be added to f_2. To save writing out a further stage we can observe
here that there is still another unsaturated path,

from s to x, to x, to x, to s’,

which can accommodate a flow of 1 unit (the capacity of (x,,s’) has
just been reduced to 1 unit). These two additional flows constitute
δf_2.


[Matrix δF_2]

hidwih seo 
4 2 
-4 =I 1 
F, = 1 1 
iy -1 3 
Me I 1 
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Ss 0 0 

1 16 0 0 0 

2 2! 78 4 0 
K,=K,-5F,= 3 2 

K,-F,= 4 4 2 

? 4 10 0 

6 3 (ri 4 

x , a 
The search tree for K_3 is

x, > {X 
5s (tno {OO aa (4) 


which shows that f_3 = f_2 + δf_2 is a maximum flow.
A minimum cut is given (from (4)) by 
St (9; 2), Ses Bay Xap Reh OE SA. 
The value of f_3 is the sum of the elements in the first row of F_3,
which is
8 + 1 + 2 = 11.
The capacity of the cut which is indicated by the dotted line on
the diagram on page 145 is the sum of the capacities of edges which
cross the cut and is


2 + 1 + 5 + 3 = 11.
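
The whole procedure of this section (search, augment, update the capacity matrix, repeat) can be sketched as follows. This is an illustration only, assuming a dense capacity matrix held as a list of lists and a breadth-first search; the function name max_flow and the four-node symmetric network are hypothetical and are not the network solved above. The set returned with the flow is the set of nodes reached in the final, unsuccessful search, which defines a minimum cut.

```python
from collections import deque


def max_flow(K0, source, sink):
    """Augment along unsaturated paths until none remains.

    Returns (value, F, S): the flow value, the skew-symmetric flow matrix,
    and the set S of nodes reachable from the source in the final search.
    """
    n = len(K0)
    K = [row[:] for row in K0]            # current capacities, overwritten
    F = [[0] * n for _ in range(n)]
    while True:
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            i = queue.popleft()
            for j in range(n):
                if K[i][j] > 0 and j not in parent:
                    parent[j] = i
                    queue.append(j)
        if sink not in parent:            # no unsaturated path is left
            return sum(F[source]), F, set(parent)
        path = [sink]
        while path[-1] != source:
            path.append(parent[path[-1]])
        path.reverse()
        delta = min(K[a][b] for a, b in zip(path, path[1:]))
        for a, b in zip(path, path[1:]):
            F[a][b] += delta
            F[b][a] -= delta
            K[a][b] -= delta              # keeps K equal to K0 - F
            K[b][a] += delta


# Hypothetical symmetric network: node 0 is s, node 3 is s'.
K0 = [[0, 3, 2, 0],
      [3, 0, 1, 2],
      [2, 1, 0, 3],
      [0, 2, 3, 0]]
value, F, S = max_flow(K0, 0, 3)
# Capacity of the cut (S, S'): edges leaving S.
cut_capacity = sum(K0[i][j] for i in S for j in range(len(K0)) if j not in S)
```

The equality of value and cut_capacity is the check made at the end of the worked example: the flow is a maximum because its value equals the capacity of a cut.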


11.4 

The method of the previous section is rather tedious for small 
networks, particularly those which can easily be described in a 
two-dimensional diagram and which can usually be solved by inspection,
using the principles of the method, but without writing out the
capacity and flow matrices K and F. The procedure at each stage 
involves a more exhaustive and less precisely defined search than 
that needed at each stage of the simplex method, and this aspect 
is typical of algorithms for integer linear programming problems. 

To implement the method in practice one has various options, most 
of which make an insignificant difference to the efficiency. The most 
natural approach is to store (row-wise) only the elements of K and 
F corresponding to edges that are present in N, and to store (K)_ij
and (F)_ij in consecutive locations of a one-dimensional array. The
essential point is that both K and F are sparse, often very sparse,
and the elements which may be non-zero are defined by the network
and do not change during the algorithm. It is more convenient to
store K_r and overwrite at each stage, but it makes no difference
whether we also store K_0 or F_r.

A comprehensive treatment of network problems is given in {11}. 
Exercises 11

1. For a capacitated network (N,k) with a single source s and a single 
sink s’ write the problem of finding a maximum flow as a l.p.p.
in standard (dual) form. 

2. Find a maximum flow in the network below, where the capacities 
indicated are symmetric. 


Find by inspection an alternative maximum flow and the corre- 
sponding minimum cut. 

3. Find a maximum flow and a minimum cut for the network below, 
where the capacities are as indicated by the arrows. 


[Figure: network diagram with directed capacities]


4. Denoting the s- and s’-rows and columns of capacity and flow 
matrices by the suffices 0 and n + 1 respectively, show that
(i) for any flow matrix F, and any i, i = 0,1,...,"4 1, 
yi-0 (F,),,, = 9, and 
(ii) for any capacity matrix K, and any i, i = 0,1,...,2+ 1, 
Lino (K,),5, a Dis (K,),.,,;- 
5. Suppose a capacitated network (N,k) has t sources s_1, s_2, ..., s_t,
t’ sinks s’_1, s’_2, ..., s’_t’, and n other nodes x_1, x_2, ..., x_n, and it is
required to find a maximum flow from the set of sources to the
set of sinks. Devise a capacitated network with a single source
s and a single sink s’ and (n + t + t’) other nodes whose maximum
flow (or flows) provides a solution to the given problem.


CHAPTER 12 


ASSIGNMENT PROBLEMS: THE MARRIAGE 
PROBLEM 


12.1 

In this chapter we use an interesting and elegant network to solve 
the simple assignment problem and the optimum assignment problem 
mentioned in chapter 10. We describe each problem in terms of one 
particular situation, but many other situations clearly lead to the same 
mathematical problems. The essential feature is the requirement to 
match or assign the members of one set to the members of another, 
either so that some common criterion is satisfied or so that some 
quantitative measure of the success of the matching is optimised. 


The Simple Assignment Problem

Suppose there are m individuals I_1, I_2, ..., I_m to be assigned to n
jobs J_1, J_2, ..., J_n. Each individual may only be assigned to those jobs
for which he or she is qualified. The problem is to assign as many 
individuals as possible. 

The situation may be described by the m × n qualification matrix
Q in which

q_ij = 1 if I_i is qualified for J_j, and
q_ij = 0 if I_i is not qualified for J_j.

The values 0 and 1 are unimportant; they are used here just as two
different symbols. The problem may now be restated as:

given a qualification matrix Q find as many distinct 1’s as 

possible such that no two of them are in the same row or column. 

If m ≠ n we can introduce fictitious persons or jobs with no
qualifications, i.e. rows or columns of Q with all entries zero. It 
is convenient to assume that this has been done so that m=n, but 
it is not necessary in practice and the method to be developed does 
not require it. 

Assuming that the jobs are desired by the individuals and are not 
some form of punishment, then we have a convenient and benevolent 
view of the situation if we try to find every individual a job he
or she is qualified for. Clearly a necessary condition for all individuals
to be assigned is that any set of p of the individuals must between 
them be qualified for at least p distinct jobs. 

For example, with m = n = 4 and 


Q = [4 × 4 qualification matrix]      (1)


every individual is qualified for at least one job and for every job 
there is a qualified individual, but not all individuals can be assigned; 
I, I, I, are together qualified for only J, and J,. 

What is far less obvious is the converse result, that if, for 
p=1,2,...,n, any p individuals are together qualified for at least 
p distinct jobs, then all n individuals may be assigned. This is the 
central result for the simple assignment problem, and we prove it 
using an assignment network. 
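
For small problems the condition can be tested directly, if inefficiently, by examining every set of individuals. The sketch below is only an illustration: the function name violating_set is hypothetical, individuals and jobs are numbered from 0, and the matrix used is an invented one with the same feature as the example above (three individuals together qualified for only two jobs), not the matrix (1) itself.

```python
from itertools import combinations


def violating_set(Q):
    """Return (individuals, jobs) for a set of p individuals together
    qualified for fewer than p jobs, or None if no such set exists."""
    n = len(Q)
    for p in range(1, n + 1):
        for group in combinations(range(n), p):
            jobs = {j for i in group for j, q in enumerate(Q[i]) if q}
            if len(jobs) < p:
                return group, jobs
    return None


# Hypothetical qualification matrix: every individual is qualified for some
# job and every job has a qualified individual.
Q = [[1, 1, 0, 0],
     [1, 0, 0, 0],
     [0, 1, 0, 0],
     [0, 0, 1, 1]]
witness = violating_set(Q)
```

The number of sets examined grows like 2^n, which is why the network method developed below is preferred in practice.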

An assignment network (N,k) has (2n + 2) nodes, one each for
I_1, I_2, ..., I_n, J_1, J_2, ..., J_n together with a source s and a sink s’. Its
capacity function k is defined as follows:

k(s, I_i) = 1, i = 1,2,...,n,      (2)

k(J_j, s’) = 1, j = 1,2,...,n,

k(I_i, J_j) = K if I_i is qualified for J_j,

where K is some large integer which need not be specified;

all other capacities are zero.
The value of a maximum flow is at most n (or m if we had m ≠ n and m < n)
because k(s,N) = n, and if f is any flow in (N,k) we define an assignment by
saying I_i is assigned to J_j if f(I_i, J_j) = 1. If a flow has value n, then
all n individuals are assigned. Since f(s,N) = n = f(N,s’) and k(J_i, J_j)
= 0, the n unit flows from s to I_1, I_2, ..., I_n must continue to distinct
job nodes.


12.2. Theorem 14 

If, for p = 1,2,...,n, any p individuals are together qualified for
at least p distinct jobs then all n individuals may be assigned. ■

We prove that if all n individuals cannot be assigned then there
must be a set of p individuals together qualified for less than p distinct
jobs. So, suppose any maximum flow f in the assignment network
has value less than n, and let (S,S’) be a minimum cut. We may
assume w.l.o.g. that

S = {s, I_1, I_2, ..., I_p, J_1, J_2, ..., J_t},

S’ = {I_{p+1}, ..., I_n, J_{t+1}, ..., J_n, s’}.

For i ≤ p and j > t, I_i is not qualified for J_j, i.e. k(I_i, J_j) = 0 for
i ≤ p and j > t, because if I_i is qualified for J_j then

k(S,S’) ≥ k(I_i, J_j) = K > n ≥ f(S,S’)      (2)
and the same is true for i > p and j ≤ t.
Therefore
n > k(S,S’) = Σ_{i=p+1}^{n} k(s, I_i) + Σ_{j=1}^{t} k(J_j, s’) = (n − p) + t.      (3)
Hence p > t, and we have p individuals I_1, I_2, ..., I_p qualified for
fewer than p jobs, J_1, J_2, ..., J_t. ■

Thus to solve a simple assignment problem we find a maximum 
flow in the assignment network. If the flow has value n all individuals 
are assigned; if this cannot be done, the minimum cut provides a 
set of individuals and jobs which proves that it cannot be done and 
which indicates the number of individuals that cannot be assigned. 

The assignment network has a particular form which enables us
to avoid the (2n + 2) × (2n + 2) capacity and flow matrices of chapter
11, and to use instead just the assignment matrix Q.


12.3 

We describe the method for finding a maximum flow in an assignment 
network using a simple problem with m=n=4 and then examine 
a less trivial problem in the next section. 


Suppose the qualification matrix is


[4 × 4 qualification matrix Q]      (1)


An obvious solution is 
ftoJ;, Fw J3,.2, sah, I, to J,. 
It is clearly worthwhile starting with a good initial assignment, i.e.
an initial flow in the assignment network with a high value (see exercise
12.8). For convenience we consider only the simplest way to obtain
an initial assignment. This is to assign I_i, i = 1,2,...,m, to the first
of J_1, J_2, ..., J_n for which he or she is qualified and which has not
already been assigned. This assignment can be denoted easily by
replacing the appropriate 1’s in the assignment matrix by −1 (remember
that the 1’s, and now the −1’s, have no numerical value; they are
just convenient symbols).
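
This simplest initial assignment can be sketched in a few lines. The function name initial_assignment and the matrix below are hypothetical (the matrix is not the example (1) of this section); rows are individuals and columns are jobs, numbered from 0.

```python
def initial_assignment(Q):
    """Return a copy of Q in which each individual, taken in order, is
    assigned to the first free job he or she is qualified for (1 becomes -1)."""
    A = [row[:] for row in Q]
    taken = set()                     # jobs already assigned
    for row in A:
        for j, q in enumerate(row):
            if q == 1 and j not in taken:
                row[j] = -1
                taken.add(j)
                break
    return A


# Hypothetical qualification matrix.
Q = [[1, 1, 0, 0],
     [1, 0, 1, 0],
     [0, 0, 1, 1],
     [1, 0, 0, 0]]
A = initial_assignment(Q)
```

Here three of the four individuals are assigned; the last is left unassigned because the only job he or she is qualified for has already been taken.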
For the example above, this initial assignment gives 


[matrix Q with the initial assignment marked by −1]      (2)


A new capacity function corresponding to this flow would have 
k(s,I,) = k(s,J,) = k(s,1,) = 0 

k(I,,J,) = k(1,,J,) = k(,,J4) = K —1, which is effectively still K, 
k(J,,1,) = k(J,,1,) = k(V4./,) = 1, instead of 0, 

k(J,,s’) = kV,,5') = kJ,,5') = 0, 

and all other capacities unchanged. 

In seeking an unsaturated path from s to s’, we can go from s
to any unassigned individual (there is just one in this case), then to
any J_j for which this individual is qualified, then to s’ if the job is
unassigned or to I_k if J_j has been assigned to I_k, and so on. In terms
of the current matrix Q this means we can find an unsaturated path from I_i to
any J_j where q_ij = 1 and from J_j to that I_k for which q_kj = −1; in other
words along a row of Q to any 1 and then along the column to
the −1.
From (2) we find the unsaturated path 
so>L-J,>-1,7J5,>s'. (3) 

Adding this flow to the flow with value 3 which led to (2) gives 
a flow with value 4 and thus all 4 individuals are assigned. It necessitates 
the following changes to the current Q: 

44, from | to -1, 

q,, from -1 to 1, 

q,, from | to -1, 
to give 


which indicates the (final) assignment of all individuals. 
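
The row-and-column search and the resulting changes to Q can be sketched as follows. This is an illustration under assumptions, not a transcription of the example above: the matrix is a hypothetical one in which three individuals have already been assigned (entries −1), the search is breadth-first, and the function name augment is invented for the sketch.

```python
def augment(A):
    """Try to assign one more individual, updating A in place.

    A holds 1 (qualified, not assigned), -1 (assigned) and 0 (not qualified).
    Returns True if an unsaturated path was found, False otherwise.
    """
    rows, cols = len(A), len(A[0])

    def assigned_job(i):
        # Column holding the -1 in row i, or None if I_i is unassigned.
        return next((j for j in range(cols) if A[i][j] == -1), None)

    def holder(j):
        # Row holding the -1 in column j, or None if J_j is unassigned.
        return next((i for i in range(rows) if A[i][j] == -1), None)

    for start in range(rows):
        if assigned_job(start) is not None:
            continue
        reached_from = {}             # job -> individual it was reached from
        frontier, seen = [start], {start}
        while frontier:
            i = frontier.pop(0)
            for j in range(cols):
                if A[i][j] != 1 or j in reached_from:
                    continue          # only move along a row to a new 1
                reached_from[j] = i
                k = holder(j)
                if k is None:         # J_j is unassigned, so the path reaches s'
                    while j is not None:
                        i = reached_from[j]
                        previous = assigned_job(i)
                        A[i][j] = -1  # 1 becomes -1 along the path
                        if previous is not None:
                            A[i][previous] = 1   # -1 becomes 1
                        j = previous
                    return True
                if k not in seen:     # then down the column to the -1
                    seen.add(k)
                    frontier.append(k)
    return False


# Hypothetical matrix with an initial assignment of three individuals.
A = [[-1, 1, 0, 0],
     [1, 0, -1, 0],
     [0, 0, 1, -1],
     [1, 0, 0, 0]]
found = augment(A)
```

In this small case the path found is s, I_4, J_1, I_1, J_2, s’ (numbering from 1), so three entries of the matrix change, just as three entries changed in the example above.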
The assignment network with the initial flow indicated is

[Figure: the assignment network with the initial flow]

and the dashed path is that of (3).
A slightly more complicated example has the qualification matrix
and initial flow given by

[qualification and initial flow matrix]

and the unsaturated path search given by

J (> 1,
so {l,->
J,7-1L- (J, s’.      (6)


12.4


A more interesting simple assignment problem is provided by the
qualification and initial flow matrix

[10 × 10 qualification and initial flow matrix]      (1)

The unsaturated path search gives

J,—>I,— {X
I,—> 4J,—7 1,— {X
s—> J,71L7 V7 s’
Io I, :

and the new qualification and flow matrix

[10 × 10 qualification and flow matrix]      (2)

The unsaturated path search for (2) gives

J, 1, {Xx 

s— {1,,—> 1 J, 1, > (J, > 1, > (X (3) 

1 (Ip ham {kX . 
Thus there are no unsaturated paths, a maximum flow has value 9,
at most nine individuals can be assigned, and so for some p,
1 ≤ p ≤ 10, there must be a set of p individuals together qualified
for p − 1 (= p − (10 − 9)) jobs.
The minimum cut given by (3) is 


Riley hy Lael dyed giI,3953 I5s Is Tg) 


and we confirm that the six individuals /,, J,, J,, I;, 1, I,, are qualified 
for only the five jobs J,, J,, J,, J,, J,o- 


12.5 The Optimum Assignment Problem, also known as the Marriage 
Problem 

As we observed in chapter 10, this is a degenerate form of the
categorised optimum assignment problem in which each category
contains only one individual or job. As with the transportation problem,
where we ensured that Σ a_i = Σ b_j by introducing a fictitious source
or destination with zero cost coefficients, here we can easily arrange
that m = n by introducing fictitious persons or jobs as necessary with
zero ratings. It is convenient for discussion purposes to assume that
this has already been done so that m = n. Then the problem may be
defined as follows:

n individuals I_1, I_2, ..., I_n apply for n jobs J_1, J_2, ..., J_n
and the i-th individual’s rating for the j-th job is r_ij;

find the assignment scheme which maximises the sum

of the assigned ratings.
This is the same as finding n elements of the n × n rating matrix
R such that exactly one is in each row and column of R and their
sum is maximised, which is itself the same as finding that permutation
matrix P (see section 3.7) which maximises the trace of PR (or of
RP), (trace (A) = Σ_{i=1}^{n} a_ii).

Both are the same as the l.p.p.


maximise Σ_{i,j=1}^{n} r_ij x_ij subject to      (1)
Σ_{j=1}^{n} x_ij = 1, i = 1,2,...,n,   Σ_{i=1}^{n} x_ij = 1, j = 1,2,...,n,
and x_ij ≥ 0, i, j = 1,2,...,n.

The method of chapter 10 for solving transportation problems can 
clearly be used and will result in an integer solution since a_i,
i = 1,2,...,m, and b_j, j = 1,2,...,n, of (3) section 10.2 here all have
the value 1. However, all b.f.s.s will have exactly n non-zero basic
variables and exactly n zero basic variables, and although the procedure
of section 10.4(iii) to avoid cycling can be used without difficulty,
we are still likely to obtain θ = 0 frequently, and each such stage
produces no definite progress towards the optimum assignment. For
the case in which the ratings are integers an interesting alternative
method, which combines the duality theorem with the method of
section 12.3 for the simple assignment problem, is developed in this
and the following section. Before this development begins we mention
a piquant interpretation of the optimum assignment problem, the
marriage problem. Here, a community of n men and n women, members
of a pioneering colony perhaps, decide that the future happiness of 
the community (and, no doubt, its present tranquility) would best 
be assured by abandoning the traditional haphazard and competitive 
process of courtship in favour of an orderly and fair assignment. 
Accordingly, each woman expresses the desirability of each of the 
men as a marriage partner by choosing for each of them a numerical 
rating (integer) and it is agreed that the assignment which maximises 
the sum of the assigned ratings will provide the basis for a comprehen- 
sive ceremony. As the example in section 12.7 demonstrates, ‘‘total 
antipathy’’ can easily be taken into account. 

Instead of converting the l.p.p. (1) to canonical form and using
results from chapters 5 and 10 we shall give an independent proof
of the duality theorem, which leads directly to a computational
algorithm. In this instance, we will treat the problem directly as a
maximisation problem so that the ratings can be left as positive integers
and a minus sign used to denote an assignment as in section 12.3.
The l.p.p. (1) may be written
maximise r’x subject to Ax = e, x ≥ 0,      (2)
where A is the 2n × n² transportation matrix and e is a 2n-vector
with e_i = 1, i = 1,2,...,2n. We will refer to the l.p.p. (2) as the
primal; its dual is
minimise Σ_{i=1}^{n} u_i + Σ_{j=1}^{n} v_j subject to u_i + v_j ≥ r_ij,
i, j = 1,2,...,n (ER).      (3)
The objective functions of both problems are bounded above and
below respectively (ER) and both problems have feasible solutions,
so both problems have optimum solutions. The duality theorem is
therefore simplified, but has the additional complication of integer
requirements. We shall state it in terms of the l.p.p.s (2) and (3).


12.6 Theorem 15. The Duality Theorem for the Integer Optimum
Assignment Problem
The maximum value of f(x) = Σ_{i,j=1}^{n} r_ij x_ij, where the r_ij are integers,
subject to
Σ_{i=1}^{n} x_ij = 1, j = 1,2,...,n,   Σ_{j=1}^{n} x_ij = 1, i = 1,2,...,n, and
x_ij = 0 or 1, i, j = 1,2,...,n,
is the same as the minimum value of g(u,v) = Σ_{i=1}^{n} u_i + Σ_{j=1}^{n} v_j subject
to u_i + v_j ≥ r_ij and u_i, v_j integers, i, j = 1,2,...,n. ■
The converse result, that f(x) = g(u,v) for feasible x, u, v implies
optimality, is easily established directly. For any feasible x, u, v
Σ_{i,j} r_ij x_ij ≤ Σ_{i,j} (u_i + v_j) x_ij = Σ_{i,j} u_i x_ij + Σ_{i,j} v_j x_ij
= Σ_i u_i Σ_j x_ij + Σ_j v_j Σ_i x_ij = Σ_i u_i + Σ_j v_j,
so that max f(x) ≤ min g(u,v).
To establish the theorem itself, suppose u and v satisfy the constraints
of the l.p.p. (3) and define a qualification matrix Q by

q_ij = 1 if u_i + v_j = r_ij,
q_ij = 0 if u_i + v_j > r_ij,

i, j = 1,2,...,n.

For the simple assignment problem defined by Q either (i) all I_i can
be assigned, or (ii) not all I_i can be assigned, and we examine these
two possibilities in turn.

(i) Perform the assignment and put x_ij = 1 if I_i is assigned to J_j,
and x_ij = 0 otherwise, i, j = 1,2,...,n.
Then

Σ_{i,j} r_ij x_ij = Σ_{i,j such that x_ij ≠ 0} r_ij = Σ_{i,j such that x_ij ≠ 0} (u_i + v_j) = Σ_{i=1}^{n} u_i + Σ_{j=1}^{n} v_j,

because for each j there is exactly one x_ij ≠ 0, and for each
i there is exactly one x_ij ≠ 0, and hence the assignment is optimum.
(ii) In this case, we know from theorem 14 that there must be a
subset, P say, of p of the individuals I_i who are together qualified
for a subset, T say, of t of the jobs J_j, where t < p.

Define new values u’, v’ for the dual variables by

u’_i = u_i − 1 if I_i ∈ P,
u’_i = u_i if I_i ∉ P,

v’_j = v_j + 1 if J_j ∈ T,
v’_j = v_j if J_j ∉ T.

The new values of the dual variables satisfy the dual constraints
u’_i + v’_j ≥ r_ij, because:
if I_i ∈ P and J_j ∈ T then

u’_i + v’_j = u_i − 1 + v_j + 1 = u_i + v_j ≥ r_ij,
if I_i ∉ P and J_j ∉ T then

u’_i + v’_j = u_i + v_j ≥ r_ij,
if I_i ∉ P and J_j ∈ T then

u’_i + v’_j = u_i + v_j + 1 > u_i + v_j ≥ r_ij.
If I_i ∈ P and J_j ∉ T then I_i is not qualified for J_j, so q_ij = 0 and
u_i + v_j > r_ij. Since u_i, v_j, r_ij are integers u_i + v_j ≥ r_ij + 1, and
u’_i + v’_j = u_i − 1 + v_j ≥ r_ij.
Thus u’_i + v’_j ≥ r_ij for i, j = 1,2,...,n. However

Σ_i u’_i + Σ_j v’_j = (Σ_i u_i) − p + (Σ_j v_j) + t < Σ_i u_i + Σ_j v_j,

because p > t, and this contradicts the optimality of u, v. ■
The proof of theorem 15 provides a method for solving the marriage
problem. An initial feasible solution for the dual is easy to find,
for example

u_i = max_j r_ij, i = 1,2,...,n,   v_j = 0, j = 1,2,...,n.

Then if all individuals in the simple assignment problem corresponding
to this dual solution can be assigned, the assignment solves the marriage
problem, and if not the sets P and T lead to the improved dual solution
u’, v’.
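
The method can be sketched as a short program. The sketch rests on several assumptions that are not in the text: the ratings are integers with no antipathy entries, each dual change is by exactly 1 (the example of the next section sometimes takes larger steps), the simple assignment problem is solved by a recursive path search, and P and T are taken to be the individuals and jobs met when searching from the unassigned individuals. The name solve_marriage and the 3 × 3 rating matrix are hypothetical.

```python
def solve_marriage(R):
    """Return (job_of, u, v) for an integer rating matrix R (list of lists)."""
    n = len(R)
    u = [max(row) for row in R]       # initial dual solution
    v = [0] * n
    while True:
        # Qualification matrix: equality holds in the dual constraint.
        qualified = [[u[i] + v[j] == R[i][j] for j in range(n)]
                     for i in range(n)]
        job_of = [None] * n           # job assigned to each individual
        holder = [None] * n           # individual assigned to each job

        def try_assign(i, seen_jobs):
            for j in range(n):
                if qualified[i][j] and j not in seen_jobs:
                    seen_jobs.add(j)
                    if holder[j] is None or try_assign(holder[j], seen_jobs):
                        holder[j] = i
                        job_of[i] = j
                        return True
            return False

        for i in range(n):
            try_assign(i, set())
        unassigned = [i for i in range(n) if job_of[i] is None]
        if not unassigned:
            return job_of, u, v
        # P and T: individuals and jobs met in the failed search.
        P, T = set(unassigned), set()
        stack = list(unassigned)
        while stack:
            i = stack.pop()
            for j in range(n):
                if qualified[i][j] and j not in T:
                    T.add(j)
                    if holder[j] not in P:
                        P.add(holder[j])
                        stack.append(holder[j])
        for i in P:
            u[i] -= 1                 # decrease u_i for I_i in P
        for j in T:
            v[j] += 1                 # increase v_j for J_j in T


# Hypothetical rating matrix.
R = [[5, 4, 3],
     [4, 3, 1],
     [3, 1, 1]]
job_of, u, v = solve_marriage(R)
total = sum(R[i][job_of[i]] for i in range(3))
```

Each pass either assigns everyone or reduces the value of the dual solution by p − t ≥ 1, so with integer ratings the loop must stop, and when it stops the sum of the assigned ratings equals the value of the dual solution.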
12.7 Example
Solve the marriage problem with m = n = 5 and rating matrix 


12-919 38 
SRC a 
Sus 10 Je 9 . (1) 


Ft @ § 2 


where z denotes antipathy, i.e. an unacceptable assignment. 

An initial dual solution is given by v_j = 0, j = 1,2,...,5,
u_1 = 12, u_2 = 9, u_3 = 11, u_4 = 6, u_5 = 12, and we solve the simple
assignment problem with qualification matrix Q,

    1 0 0 0 0
    0 0 0 0 1
Q = 0 0 0 1 0      (2)
    1 0 0 0 0
    0 0 0 0 1

We can see at once that all 5 individuals are together qualified for
only J_1, J_4, J_5 so we can decrease u_1, u_2, u_3, u_4, u_5 and increase
v_1, v_4, v_5. This is the simplest way to improve the dual solution
when not all jobs have a qualified applicant.

Denoting those r_ij for which u_i + v_j = r_ij by *r_ij we now have


and again, J_2 has no qualified individuals so the u_i and v_j indicated
by ↓ and ↑ are decreased and increased respectively. In this case
a decrease/increase of 1 produces no new *r_ij so we can do the
next stage as well if we decrease/increase by 2.
This gives
J, ds cs 
*9 10 3 
—*6 2 z 
*§ —-—*10 *11 (4) 
"3 4 | 
1 10 9 
Of zZ 3 


Now all J_j have a qualified individual, so to solve the simple assignment
problem we assign initially I_1 to J_1, I_2 to J_2, I_3 to J_3, I_5 to J_5 and
search for an unsaturated path from s to I_4 and hence to s’. This
initial flow has been indicated in (4) by inserting a minus sign before
ratings r_11, r_22, r_33, r_55.


Thus P = {I_1, I_2, I_4, I_5} and T = {J_1, J_2, J_5}, so we decrease u_1,
u_2, u_4, u_5, and increase v_1, v_2, v_5 as indicated in (4).


(5) 


Notice that in (4) u_3 + v_2 = r_32, but in (5) u_3 + v_2 > r_32, so that
qualifications are not necessarily maintained from one stage to the next.
The initial assignment in (5) is indicated by minus signs, and the
unsaturated path search gives
J,> 1,7 {XxX 
s> (1,7 4J, 71, (J, > I. 


J,> 1,7 (I, s' 
This extra flow in the assignment network means that all individuals
are assigned,
I_1 to J_1, I_2 to J_2, I_3 to J_4, I_4 to J_3, I_5 to J_5.      (6)
The sum of the assigned ratings is
12 + 6 + 11 + 4 + 12 = 45,
and the value of the dual solution is
(8 + 5 + 8 + 2 + 8) + (4 + 1 + 2 + 3 + 4) = 45,
confirming that the assignment (6) is optimum.


12.8 

We should not leave the problems of this and the previous chapter 
without some further comments. The assignment problems we have 
discussed are examples of combinatorial or set-covering problems 
(sometimes called zero-one problems). For such problems a variety 
of methods has been devised, each of which is more or less efficient 
depending on the structure of the particular problem in question (see, 
for example, {9}, {10}, {11}). In chapters 11 and 12 we have presented 
methods for network flow, simple assignment and optimum assignment 
problems partly for the intrinsic appeal of the problems themselves, 
and partly for the interesting way in which the methods develop from 
each other. 

A common feature both of the methods developed here and of 
methods for integer /.p.p.s in general is the need for repeated extensive 
searches through stored information, and these searches can be very 
time-consuming. This feature distinguishes the situation from that 
of solving general /.p.p.s by the simplex method. The optimism that 
has been expressed earlier about the efficiency of the simplex method 
in practice cannot always be carried over to these more specialised 
problems; in some instances the exponential-time quality of the 
algorithms is experienced in practice. There is not a contradiction 
here because the restriction to integer values is qualitatively rather 
different from the general half-space constraints. Notice also that 
even small network or assignment problems have many constraints 
and variables and give rise to quite large /.p.p.s (see exercise HER); 
Exercises 12
1. Solve the simple assignment problem whose qualification matrix 


is 
01-0 4.4.58 
Or 1 0. 4) 060.8 
bnbod ef 10) bas@uap 
OL i8 ee 
00.8 1 s-e8 3 
CD *@ "gee nee 


(i) by inspection (constructively), 
(ii) by the method of section 12.3. 

2. Solve the simple assignment problem whose qualification matrix 
is given by 


[qualification matrix]

3. (i) The size and shape of the tree that describes the search for 
an unsaturated path is not known before the search takes place. 
Discuss how the search procedure could be implemented and 
stored automatically by a computer. 

(ii) From the point of view of automatic computation, why is it 
advantageous to denote assignments (flows in the assignment 
network) by minus signs, and how could the equality dual 
constraints *r_ij be denoted?

4. Solve the optimum assignment problem in which the rating matrix
is 


a £2 4 4. 88 
en ee Lay ae a 
7-7 6.5 2 8 3 
RS ae A a 
a eS Po BB 
See 2.5 a 
ees ae 
Pas Fee 
5. Solve the marriage problem in which the rating matrix is 
Foe oe web es Se eS 
foe .s 88 2 t £238 
wes 2 2 4 F € 1S ; 
Pe eS ae eee 
rw 2s £6 42 TE CC, 
ote FS a SFG 
Seite FS) 8 Ire 
SS 6 2 612 1 2°33 
S22 22 st Ft eR 1 
ot woe £€ 72 6 3 


where z denotes total antipathy. 

6. The marriage problem was described from a female point of view 
in section 12.5. Suppose the men involved wish their opinions 
(ratings) to be taken into account as well (not instead). Suggest 
two distinct ways in which this could be done. (One way is perhaps 
rather unrealistic, but is more realistic for the applicants and jobs 
situation.)

7. In the optimum assignment problem prove that in an optimum 
assignment at least one individual is assigned to the job he or 
she is best qualified (highest rated) for. Deduce another similar 
result. 

Hint: assume a convenient form for the assignment, consider the 
cases n = 2, 3, 4 and obtain the general result by induction. 

8. Devise an improvement to the method of section 12.3 for obtaining 
an initial assignment for solving the simple assignment problem. 


CHAPTER 13 


GAME THEORY: TWO-PERSON MATRIX GAMES 


13.1 

For a certain class of games the problem of determining the best 
stratagem for playing the game can be formulated as a l.p.p. These
are two-person zero-sum matrix games.

A simple example enjoyed by children is the stone-paper-scissors 
game. Here, the two players X and Y simultaneously shout one of 
the words stone, paper, or scissors; if both shouts are the same, 
the result is a tie, otherwise stone beats scissors, scissors beats paper 
and paper beats stone. The game is played many times and the winner 
each time receives a fixed predetermined ‘‘reward’’. In general the 
essential features of a two-person zero-sum matrix game are 

(i) the two players compete against each other with no external 
influences, 

(ii) each play of the game consists of both players choosing indepen- 
dently one of a finite number of possible alternatives, 

(iii) the consequence of any pair of choices is fixed and known in 
advance, and 

(iv) the winner’s gain is the loser’s loss. 

Such a game is completely described by an m × n matrix A, the
payoff matrix, in which the element a_ij is the payoff, i.e. X’s gain
and Y’s loss, when X chooses the j-th of X’s n possible alternatives
and Y chooses the i-th of Y’s m possible alternatives.

For the stone-paper-scissors game the payoff matrix is:


                        X plays
                  stone  paper  scissors
         stone      0      1      −1
Y plays  paper     −1      0       1     = A,      (1)
         scissors   1     −1       0


where the loser each time pays the winner one point. If we regard 
A as defining the game from X’s point of view, then —A defines 
the game from Y’s point of view. Thus a game defined by a
skew-symmetric payoff matrix, A’ = —A, is the same for both players 
and we would expect such a game to be fair (see exercise 13.2). 

The definitive properties of a two-person zero-sum matrix game 
probably apply precisely only in genuine games, but by categorising 
the possible alternative actions and assessing the payoffs, a number 
of economic, management and military situations can be modelled 
and analysed as matrix games (e.g. see exercise 13.8). 

The results concerning matrix games were first established by von 
Neumann in 1928, long before l.p. theory was developed. They are
easily obtained when the connection with /.p. has been made and 
we shall confine ourselves to this approach, which was first established 
by Dantzig in 1951. 

If either player can predict the other’s next choice then, because 
the payoffs are known and each play of the game is a separate 
independent event, that player will use that information to his or 
her advantage. For example, in the stone-paper-scissors game, if X 
knows that Y’s next play will be stone then X will play paper. For 
this reason, both players must make each successive choice randomly. 
However, within this restriction they can both decide the proportion 
of times they choose each of their possible alternatives. So if x_j,
j = 1,2,...,n, is the probability that X plays X’s j-th alternative,
then X’s problem is to choose that strategy vector x such that X’s
average gain is a maximum; and Y’s problem is to choose a strategy
vector y = (y_1, y_2, ..., y_m)’ such that Y’s average loss is a minimum.

If X chose the strategy vector (1/2, 1/4, 1/4)’, for the stone-paper-scissors
game, then X would play:

stone, on average once every two plays,
paper, on average once every four plays, and
scissors, on average once every four plays.


This is not X’s optimum stratagem, because if Y played paper every
time then y = (0,1,0)’ and X’s average gain would be −1/4, whereas
if X chose (1/3, 1/3, 1/3)’ (which is X’s optimum stratagem) then whatever
Y played, X would expect to win, lose or draw equally often, so
X’s average gain would be zero. These examples of stratagems make
it clear that we need to be precise about what we mean by optimum
stratagems for X and Y. We said that (1/3, 1/3, 1/3)’ was optimum for
X because with this stratagem X can expect to break even. With
any stratagem for which some x_j > 1/3, Y can choose a stratagem
such that X can expect, on average, to lose: for example y = (0,1,0)’
if x_1 > 1/3 and x_2, x_3 < 1/3. However, if Y chooses (0,1,0)’ then x =
(0,0,1)’ ensures that X wins 1 every time. So, by an optimum stratagem
for X, we mean that stratagem which maximises X’s average gain 
given that Y will choose a stratagem which minimises Y’s average 
loss. Thus we assume that both players will play consistently and 
as skilfully as possible, and both will choose stratagems according 
to this assumption. Their choice is based on an assumption about 
the nature of their opponent’s stratagem, but is independent of it 
and made without observing it. This rather subtle notion distinguishes 
two-person matrix games from similar problems in which the payoff 
matrix refers to a person-nature or a person-machine situation. 
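
The average gains quoted in this discussion can be checked with a few lines of exact arithmetic. The sketch assumes the stone-paper-scissors payoff matrix with rows for Y's choices and columns for X's choices, in the order stone, paper, scissors; the function name average_gain is hypothetical.

```python
from fractions import Fraction as Fr

# Payoff to X: rows are Y's choices, columns are X's choices.
A = [[0, 1, -1],
     [-1, 0, 1],
     [1, -1, 0]]


def average_gain(y, x):
    """X's average gain y'Ax for strategy vectors y (for Y) and x (for X)."""
    return sum(y[i] * A[i][j] * x[j] for i in range(3) for j in range(3))


paper = [0, 1, 0]                              # Y plays paper every time
biased = [Fr(1, 2), Fr(1, 4), Fr(1, 4)]
uniform = [Fr(1, 3), Fr(1, 3), Fr(1, 3)]
scissors = [0, 0, 1]                           # X plays scissors every time

gain_biased = average_gain(paper, biased)      # X loses on average
gain_uniform = average_gain(paper, uniform)    # X breaks even
gain_scissors = average_gain(paper, scissors)  # X wins 1 every time
```

Using Fraction rather than floating-point numbers keeps the values exact, so they can be compared directly with the fractions in the text.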

The idea of optimum stratagems as we have defined it is only 
compatible with the zero-sum aspect of matrix games if there is a 
unique quantity v, which we call the value of the game, and which 
is both the maximum amount X can be sure of gaining on average 
and the minimum amount Y cannot avoid losing on average. The 
existence of such a v and of optimum stratagems for X and Y is 
the substance of the Fundamental Theorem of Two-person Zero-sum 
Matrix Games. (Such games from now on will simply be called matrix 
games.) A fair game is one which has value zero. 


13.2 The Linear Programming Connection 

Consider the problem of determining X’s optimum stratagem. The
variables $x_j$, $j = 1,2,\dots,n$, as they are probabilities (or proportions),
must satisfy

$$\sum_{j=1}^{n} x_j = 1, \quad x_j \ge 0, \quad j = 1,2,\dots,n. \qquad (1)$$

For those plays in which Y chooses Y’s i-th alternative, X’s expectation,
or average gain, is

$$\sum_{j=1}^{n} a_{ij}x_j.$$

Denoting by p the average gain which the stratagem x will guarantee
to X, we must have

$$\sum_{j=1}^{n} a_{ij}x_j \ge p, \quad i = 1,2,\dots,m. \qquad (2)$$

Thus X’s problem is to choose $x_1,x_2,\dots,x_n,p$ so as to maximise
$p = (0,0,\dots,0,1)\begin{pmatrix} x \\ p \end{pmatrix}$ subject to the constraints (1) and (2). This is
the l.p.p.

$$\text{maximise } (0^T, 1)\begin{pmatrix} x \\ p \end{pmatrix} \text{ subject to } Ax \ge pe, \quad e^Tx = 1, \quad x \ge 0. \qquad (3)$$

The same argument applied to Y’s problem (ER) gives the l.p.p.

$$\sum_{i=1}^{m} y_i = 1, \quad y_i \ge 0, \quad i = 1,2,\dots,m, \qquad (4)$$

$$\sum_{i=1}^{m} a_{ij}y_i \le q, \quad j = 1,2,\dots,n, \quad \text{minimise } q,$$

i.e.

$$\text{minimise } (0^T, 1)\begin{pmatrix} y \\ q \end{pmatrix} \text{ subject to } y^TA \le qe^T, \quad y^Te = 1, \quad y \ge 0, \qquad (5)$$

where q is Y’s expected loss, and $-q$ is Y’s expected gain.

The l.p.p. (5) is the dual of the l.p.p. (3) (ER). Both have feasible
solutions, trivially, and so both have optimum solutions, with maximum
p = minimum q = v say. Thus the fundamental theorem for matrix
games follows at once from the duality theorem for l.p.p.s, and the
solution of a matrix game (the value of the game and the optimum
stratagems) can be obtained by solving a single l.p.p. Notice that
$v = y_{\text{opt}}^T A x_{\text{opt}}$ (ER).

Instead of solving (3) or (5) by converting to canonical form, we 
first reformulate both X’s problem and Y’s problem. 

The optimum stratagems, $x_0$ and $y_0$ say, are unchanged if we replace
$a_{ij}$ by $a_{ij} + \alpha$, $i = 1,2,\dots,m$, $j = 1,2,\dots,n$, but the value of the
game changes, by an increase $\alpha$ (ER). If we choose $\alpha$ to ensure
that all $a_{ij} + \alpha$ are strictly positive then the value of the game must
be strictly positive, and so in (3) and (5) p and q are positive. Assuming
that this has been done and denoting $a_{ij} + \alpha$ by $a'_{ij}$, put $x'_j = x_j/p$,
$j = 1,2,\dots,n$, so that X’s problem becomes

$$\text{minimise } e^Tx' \text{ subject to } A'x' \ge e, \quad x' \ge 0, \qquad (6)$$

because $e^Tx' = \sum_{j=1}^{n} x_j/p = 1/p$, and X wishes to maximise p. This
is a l.p.p. in standard primal form.

Similarly, with $y'_i = y_i/q$, Y’s problem becomes

$$\text{maximise } y'^Te \text{ subject to } y'^TA' \le e^T, \quad y' \ge 0, \qquad (7)$$

which is in standard dual form.

In this formulation the value v of the game with payoff matrix $A'$ is given by
$\left(\sum_{j=1}^{n} x'_j\right)^{-1}$ or $\left(\sum_{i=1}^{m} y'_i\right)^{-1}$, the optimum stratagems by $x_0 = vx'_0$, $y_0 = vy'_0$,
and the value of the original game is $v - \alpha$.


Example 
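We verify for the stone-paper-scissors game that both the optimum
stratagems are $(\tfrac{1}{3},\tfrac{1}{3},\tfrac{1}{3})^T$ and the value of the game is 0.

With

$$A = \begin{pmatrix} 0 & 1 & -1 \\ -1 & 0 & 1 \\ 1 & -1 & 0 \end{pmatrix}$$

and $x_0 = y_0 = (\tfrac{1}{3},\tfrac{1}{3},\tfrac{1}{3})^T$, we have

$$Ax_0 = 0, \quad y_0^TA = 0^T, \quad \text{i.e. } Ax_0 \ge 0e, \quad y_0^TA \le 0e^T,$$
$$x_0 \ge 0, \quad y_0 \ge 0, \quad e^Tx_0 = y_0^Te = 1, \quad p = q = 0.$$

Thus $x_0$ and $y_0$, with $p = q = 0$, give feasible solutions for (3) and (5); they give
the same value for both objective functions and are therefore
optimum solutions for both problems.

Alternatively, choosing $\alpha = 2$, we have

$$A' = \begin{pmatrix} 2 & 3 & 1 \\ 1 & 2 & 3 \\ 3 & 1 & 2 \end{pmatrix}. \qquad (8)$$

With $x'_0 = (\tfrac{1}{6},\tfrac{1}{6},\tfrac{1}{6})^T = y'_0$, we have

$$A'x'_0 = e, \quad y_0'^TA' = e^T, \quad e^Tx'_0 = \tfrac{1}{2}, \quad y_0'^Te = \tfrac{1}{2},$$

so $v = 2$, $x_0 = y_0 = 2(\tfrac{1}{6},\tfrac{1}{6},\tfrac{1}{6})^T = (\tfrac{1}{3},\tfrac{1}{3},\tfrac{1}{3})^T$, and the value of the game
is $2 - \alpha = 0$.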


13.3 Pure, Mixed, Dominated and Essential Stratagems; Saddle Point 
Games 

A stratagem x of the form $x = e_j$, $1 \le j \le n$, is called a pure
stratagem and means that X chooses the same alternative every time;
otherwise a stratagem in which X uses more than one alternative
is called a mixed stratagem.

A game in which the optimum stratagems for X and Y are pure
stratagems is called a saddle point game and is easy to recognise.

The optimum pure stratagem for X is the $j_0$-th, where

$$\max_j \left(\min_i a_{ij}\right) \text{ is attained with } j = j_0.$$

Similarly the optimum pure stratagem for Y is the $i_0$-th, where

$$\min_i \left(\max_j a_{ij}\right) \text{ is attained with } i = i_0.$$

If $\max_j (\min_i a_{ij}) = a_{i_0 j_0} = \min_i (\max_j a_{ij})$ then $e_{j_0}$ and $e_{i_0}$ are X’s and
Y’s optimal stratagems because they are feasible solutions for (3)
and (5) of section 13.2 with $p = q = a_{i_0 j_0}$.
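
The saddle point test can be sketched in a few lines. This is an illustrative sketch only: the function name and the 3 × 3 matrix `G` are hypothetical, and the convention assumed is that of this chapter (rows are Y’s alternatives, columns are X’s, entries are X’s gain).

```python
def saddle_point(A):
    """Return (i0, j0, value) if max_j min_i a_ij equals min_i max_j a_ij,
    otherwise None. Indices are 0-based."""
    m, n = len(A), len(A[0])
    col_mins = [min(A[i][j] for i in range(m)) for j in range(n)]  # X's worst case per column
    row_maxs = [max(A[i]) for i in range(m)]                       # Y's worst case per row
    maximin = max(col_mins)
    minimax = min(row_maxs)
    if maximin != minimax:
        return None
    j0 = col_mins.index(maximin)
    i0 = row_maxs.index(minimax)
    return i0, j0, A[i0][j0]

# Hypothetical payoff matrix with a saddle point in row 1, column 0.
G = [[3, 1, 4],
     [2, 0, 1],
     [5, 2, 3]]
print(saddle_point(G))

# Stone-paper-scissors has no saddle point, so mixed stratagems are needed.
sps = [[0, 1, -1], [-1, 0, 1], [1, -1, 0]]
print(saddle_point(sps))
```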

This observation leads to the alternative formulation of X’s problem
and Y’s problem. For any chosen stratagem x, X’s expectations for
Y’s various alternative plays are given by the vector Ax, and so
if Y chooses the stratagem y, X’s average gain is

$$y^TAx. \qquad (1)$$

The stratagem y that Y chooses will be that which minimises X’s
average gain, so that subject to the constraints (1) and (4) of section
13.2, X requires $x_0$ such that

$$\max_x \left(\min_y y^TAx\right) = \min_y y^TAx_0. \qquad (2)$$

Similarly, Y requires $y_0$ such that

$$\min_y \left(\max_x y^TAx\right) = \max_x y_0^TAx. \qquad (3)$$

The assertion that the quantities

$$\max_x \left(\min_y y^TAx\right) \quad \text{and} \quad \min_y \left(\max_x y^TAx\right)$$

both exist and have the same value, v, is the minimax theorem of
von Neumann. We have already established this result in the previous
section as the fundamental theorem of matrix games, and we know
that

$$v = y_0^TAx_0.$$

We observe that

$$\min_y \left(\max_x y^TAx\right) = \max_x y_0^TAx \ge y_0^TAx_0 \ge \min_y y^TAx_0 = \max_x \left(\min_y y^TAx\right),$$

so that, the two outer quantities being equal, equality holds throughout,
and that

$$y_0^TAx \le y_0^TAx_0 \le y^TAx_0.$$


Suppose that

$$a_{ik} \le a_{ij}, \quad i = 1,2,\dots,m,$$

then whichever play Y chooses, X never gains more by choosing
X’s k-th alternative play in preference to the j-th. In this situation,
we may take $(x_0)_k = 0$ and we may reduce the size of the
payoff matrix A by removing the k-th column. We say that X’s k-th
alternative, or k-th pure stratagem, is dominated by the j-th. Similarly,
Y’s k-th pure stratagem is dominated by the i-th if

$$a_{kj} \ge a_{ij}, \quad j = 1,2,\dots,n,$$

in which case we may take $(y_0)_k = 0$ and the k-th row of A may be removed.

It may be the case that after removal of dominated stratagems 
the reduced payoff matrix reveals dominated stratagems which were 
not apparent in the original payoff matrix, and significant simplification 
of a matrix game may result from the elimination of dominated 
stratagems. 


Example 
The skin-game devised by Kuhn has payoff matrix

$$A = \begin{pmatrix} 1 & -1 & 2 \\ -1 & 1 & -1 \\ -2 & 1 & 0 \end{pmatrix}$$

and appears at first glance to be fair. However, X’s first pure stratagem
is dominated by the third, which reduces the payoff matrix to

$$\begin{pmatrix} -1 & 2 \\ 1 & -1 \\ 1 & 0 \end{pmatrix},$$

and now Y’s third stratagem is dominated by the second. The payoff
matrix is now

$$\begin{pmatrix} -1 & 2 \\ 1 & -1 \end{pmatrix}$$

which indicates a bias towards X.
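
The repeated removal of dominated stratagems in this example can be sketched as follows. The function names are illustrative, the skin-game matrix is the one reconstructed above, and the closed-form value of a 2 × 2 game without a saddle point is a standard formula that is not taken from this text.

```python
from fractions import Fraction

def eliminate_dominated(A):
    """Repeatedly delete dominated columns (X's stratagems) and rows (Y's).
    Returns the reduced matrix and the surviving 0-based row and column indices."""
    rows = list(range(len(A)))
    cols = list(range(len(A[0])))
    changed = True
    while changed:
        changed = False
        # X's k-th column is dominated by the j-th if a_ik <= a_ij for all i.
        for k in cols:
            if any(j != k and all(A[i][k] <= A[i][j] for i in rows) for j in cols):
                cols.remove(k)
                changed = True
                break
        if changed:
            continue
        # Y's k-th row is dominated by the i-th if a_kj >= a_ij for all j.
        for k in rows:
            if any(i != k and all(A[k][j] >= A[i][j] for j in cols) for i in rows):
                rows.remove(k)
                changed = True
                break
    return [[A[i][j] for j in cols] for i in rows], rows, cols

def value_2x2(M):
    """Value of a 2 x 2 game with no saddle point (standard formula, assumed here)."""
    (a, b), (c, d) = M
    return Fraction(a * d - b * c, a + d - b - c)

skin = [[1, -1, 2],
        [-1, 1, -1],
        [-2, 1, 0]]
reduced, rows, cols = eliminate_dominated(skin)
print(reduced, rows, cols)
print(value_2x2(reduced))   # positive, i.e. a bias towards X
```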


An essential alternative, or essential pure stratagem, is one which 
is used a strictly positive proportion of times in an optimum stratagem. 
It is not correct that all pure stratagems are either essential or
dominated, since a pure stratagem can be dominated by a combination 
of other alternatives, but not dominated by any one of them (see 
exercise 13.7). 

Extensions to n-person games may be found in {10}. 
Exercises 13

1. Prove that the optimum stratagems for a matrix game with payoff
matrix A are unchanged if A is replaced by $\alpha A$ for some $\alpha > 0$.
What is the practical interpretation of this result?


2. Prove that a matrix game in which the payoff matrix
is skew-symmetric, $A^T = -A$, has value zero. What can you say
about the optimum stratagems for X and Y?


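3. Verify that the solution of the skin-game (see section 13.3) is

$$x_0 = (0, \tfrac{3}{5}, \tfrac{2}{5})^T, \quad y_0 = (\tfrac{2}{5}, \tfrac{3}{5}, 0)^T, \quad v = \tfrac{1}{5}.$$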


4. Solve the matrix game with payoff matrix


( A | “1) 
A={-1 O° —1F- 
—2 l 2 


5. In the matching pennies game the two players X and Y simultaneously
uncover a penny: if both coins show heads or both tails
X wins and takes both, and if they show one head and one tail
Y wins and takes both. Solve this game.
Suppose the payoff matrix is changed to 
3 -2 
(-2 “1) 

and Y agrees to play only if X pays Y a premium of 1 every
10 plays. Should X agree?


6. Verify that the value of the matrix game with payoff matrix


& 2 s) 
f—2 @3 
1 2 -3 


is « and the optimum stratagems are 


6 3 252 5 8 9\T 
Gistisa> 080 Gis e) - 


7. Devise a matrix game in which m = 2, n = 3, none of X’s pure
stratagems is dominated, but only two are essential.


8. The hide-and-seek game: One player, Y say, can hide in any element
$b_{ij}$ of an $s \times t$ matrix B; X chooses to search either a row of
B or a column of B. If X finds Y the payoff is $b_{ij}$, otherwise
the payoff is $-a$. Describe the payoff matrix which defines this
situation as a matrix game and discuss situations which can be
modelled by this game.
CHAPTER 14


FURTHER APPLICATIONS: QUADRATIC 
PROGRAMMING; FUNCTIONAL APPROXIMATION; 
MATRIX EIGENVALUE PERTURBATION ANALYSIS 


14.1 

Applications of linear programming techniques discussed in earlier 
chapters concern situations in which a mathematical description of 
the problem leads naturally to a /.p.p. This chapter is concerned 
with several problems for which one would not immediately expect 
linear programming to be useful. The characteristic feature of the 
problems is the presence of linear constraints, and the methods we 
develop rely on the fact that the simplex method involves an effective 
way of handling such information. In each case other methods using 
different approaches are available which, depending on the particular 
problem, may be more effective. 


Quadratic Programming Problems (q.p.p.s) 
Here the objective function f(x) which we wish to minimise is
a quadratic function of the variables $x_1, x_2, \dots, x_n$, which we may write

$$f(x) = \tfrac{1}{2}x^TDx + c^Tx, \quad \text{where } D^T = D, \qquad (1)$$

and the problem is to minimise f(x) subject to

$$Ax = b, \quad x \ge 0. \qquad (2)$$

A constant term which might be involved in f(x) can be ignored,
and any linear inequality constraints may be put in the form (2).

We shall restrict our attention to the case in which D is positive
definite, i.e. $x^TDx > 0$ if $x \ne 0$. For this case, for any $x \ne 0$, $f(kx)$
eventually increases without bound as k increases, so any q.p.p. defined
by (1) and (2) with a non-empty feasible region has an optimum solution.

In addition to this fact, q.p.p.s differ from l.p.p.s by not necessarily
attaining their minimum value on the boundary of the feasible region
R. To illustrate this, it is convenient (as it was in chapter 1 for
l.p.p.s) to consider an example in 2-space subject to inequality
constraints.
The function

$$f(x) = (x_1 - 2)^2 + 2(x_2 - 3)^2 = \tfrac{1}{2}(x_1, x_2)\begin{pmatrix} 2 & 0 \\ 0 & 4 \end{pmatrix}\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} + (-4, -12)\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} + 22,$$

which has no $x_1x_2$ term, attains its minimum value at the point (2,3)
and has a constant value on concentric ellipses centred on this point.
For the set of constraints 


¥,( Ry SO, yy =(22 S51); Fj, x, ee 
the point (2,3) is an interior point of R, so the constraints are effectively 
redundant. 


xX, + %x,=6 


For the set of constraints 

$$x_1 + x_2 \le 3, \quad x_1, x_2 \ge 0,$$

the point (2,3) is not in R and the minimum value of f(x) subject
to these constraints is attained on the boundary of R, at A.

[Figure: the feasible region R bounded by the line $x_1 + x_2 = 3$, with the constrained minimum at the boundary point A.]


The optimum solution of a q.p.p. is characterised by the following
result:

Theorem 16
The vector $x_0$ solves the q.p.p. (1), (2) if there are vectors $\lambda_0$
and $\mu_0$ such that $x_0, \lambda_0, \mu_0$ satisfy
(i) $Ax_0 = b$, $x_0 \ge 0$,
(ii) $Dx_0 + c + A^T\lambda_0 = \mu_0 \ge 0$,
(iii) $\mu_0^Tx_0 = 0$.

To establish this result we suppose that $x_0$, $\lambda_0$ and $\mu_0$ satisfy (i),
(ii), and (iii) and we consider any other feasible vector x.

$$f(x) - f(x_0) = \tfrac{1}{2}x^TDx + c^Tx - \tfrac{1}{2}x_0^TDx_0 - c^Tx_0$$
$$= \tfrac{1}{2}(x - x_0)^TD(x - x_0) + c^T(x - x_0) + x_0^TD(x - x_0).$$

As $Ax_0 = Ax = b$, $A(x - x_0) = 0$ and $\lambda_0^TA(x - x_0) = 0$.
Hence

$$f(x) - f(x_0) = \tfrac{1}{2}(x - x_0)^TD(x - x_0) + (c^T + x_0^TD + \lambda_0^TA)(x - x_0)$$
$$= \tfrac{1}{2}(x - x_0)^TD(x - x_0) + \mu_0^T(x - x_0).$$

As D is positive definite, $\mu_0 \ge 0$, $x \ge 0$ and $\mu_0^Tx_0 = 0$,

$$f(x) - f(x_0) \ge 0.$$


This result is just the Kuhn-Tucker Theorem for constrained optimisation
applied to the q.p.p. (see {10}, {12}, {13}, {14}).

If we regard $x_1, x_2, \dots, x_n, \lambda_1, \lambda_2, \dots, \lambda_m, \mu_1, \mu_2, \dots, \mu_n$ as variables we
have (m + n) linear constraints (i) and (ii), together with a non-linear
constraint (iii) which, as $x \ge 0$ and $\mu \ge 0$, says that not both of $x_j$
and $\mu_j$ may be non-zero, for $j = 1,2,\dots,n$.

Suppose we find a basic feasible solution of (i) and we satisfy
(iii) by saying $\mu_j = 0$ if $x_j$ is a basic variable. Substituting in the
m equations of (ii) which correspond to $\mu_j = 0$ determines $\lambda$, because
the m columns of A corresponding to the basic variables $x_j$ are
independent. This leaves (n − m) equations of (ii) which determine
the remaining (n − m) variables $\mu_j$, for $x_j$ not basic,
since $D$, $x$, $c$, $A^T$, $\lambda$ are now all known.

The solution $x, \lambda, \mu$ thus obtained does not necessarily satisfy
(ii) because we have not ensured $\mu \ge 0$. However, with $\mu = u - v$
where $u, v \ge 0$, the l.p.p.

$$\text{minimise } v_1 + v_2 + \dots + v_n \text{ subject to (i) and (ii)}$$

provides a solution satisfying (i), (ii), and (iii) if $x^Tu = 0$ and $v_{\text{opt}} = 0$
and thus can be used to solve the q.p.p.


14.2 

To convert the l.p.p. developed in section 14.1 to canonical primal
form we have only to put $\lambda = s - t$ where $s, t \ge 0$.

Assuming that we have already found a b.f.s. of $Ax = b$ or, more
conveniently, that $A \supset I$ (the columns of A include those of $I_m$), then the l.p.p.

$$\text{minimise } (0^T, 0^T, 0^T, 0^T, e^T)\begin{pmatrix} x \\ s \\ t \\ u \\ v \end{pmatrix}$$

$$\text{subject to } Ax = b, \quad Dx + A^Ts - A^Tt - u + v = -c, \quad x, s, t, u, v \ge 0$$

can be solved directly by the simplex method, with the modification
that, for $j = 1,2,\dots,n$, $x_j$ and $u_j$ may not both be basic variables.
This ensures that $u^Tx = 0$ at every stage, and as we will have $v = 0$
at the optimum stage, $\mu^Tx = (u - v)^Tx = 0$ at the optimum stage.
The $(m + n) \times (n + m + m + n + n)$ matrix of coefficients is

$$\begin{pmatrix} A & 0 & 0 & 0 & 0 \\ D & A^T & -A^T & -I & I \end{pmatrix}.$$

This is easily converted to the required form by adding multiples
of the first m rows to the last n rows to reduce to zero the columns
of D corresponding to columns of $I_m$ in A. This does not affect the
matrices $-I_n$ and $I_n$ in the last n rows, so that the right-hand-side
vector, originally $\begin{pmatrix} b \\ -c \end{pmatrix}$ and now changed in its last n components, can be made non-negative by
multiplying appropriate rows by −1. Thus initially the (m + n)
basic variables will consist of m of $x_1, x_2, \dots, x_n$ and those n of
$u_1, u_2, \dots, u_n, v_1, v_2, \dots, v_n$ which now correspond to columns of $I_{m+n}$.


0 
vy’ where A D I. 


Example 
Minimise $x_1^2 + x_2^2 - 8x_1 - 10x_2$
subject to $3x_1 + 2x_2 \le 6$, $x_1, x_2 \ge 0$.

Adding a slack variable $x_3$ to produce an equality constraint

$$3x_1 + 2x_2 + x_3 = 6, \quad x_1, x_2, x_3 \ge 0,$$

we have m = 1, n = 3,

$$A = (3, 2, 1), \quad D = \begin{pmatrix} 2 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \quad c = \begin{pmatrix} -8 \\ -10 \\ 0 \end{pmatrix}.$$

An initial b.f.s. is $x_1 = x_2 = 0$, $x_3 = 6$, hence $\mu_3 = 0$ and

$$(\mu)_3 - (Dx)_3 - (c)_3 = (A^T\lambda)_3,$$

i.e. $0 - 0 - 0 = 1\cdot\lambda_1$, so $\lambda_1 = 0$, and then the first and second rows
of

$$Dx + c + A^T\lambda = \mu$$

give

$$\mu_1 = -8, \quad \mu_2 = -10.$$

The initial tableau is

[Initial tableau not legible in the source.]


Here the basic variables are $x_3, v_1, v_2, v_3$, indicated by *, and the
usual simplex procedure leads to the introduction of $s_1$ into the basis
and the removal of $v_3$ as indicated.

At this stage the usual simplex procedure would choose a pivot in
the $u_3$ column and make $u_3$ a basic variable with value $\tfrac{8}{3}$. This cannot
be allowed as $x_3$ is a basic variable. We could make $u_3$ a basic variable
if $x_3$ were the basic variable to be replaced, but that is not the case
here; instead we choose a pivot in an $x_j$ column corresponding to
the next largest negative e.c.c. This leads to the following tableau.

Again the largest negative e.c.c. corresponds to $u_3$, but this time,
as $x_3$ is no longer a basic variable, we can make $u_3$ a basic variable.

[Subsequent tableaux not legible in the source.]


All e.c.c.s are now non-negative and v = 0 so we have the optimum
solution, which is

$$x_1 = \tfrac{4}{13}, \quad x_2 = \tfrac{33}{13}, \quad x_3 = 0, \quad \mu_1 = \mu_2 = 0, \quad \mu_3 = \lambda_1 = \tfrac{32}{13}.$$

It can be shown that, for D positive definite, the method described
above cannot terminate unless the vector v has value zero.
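
The stated solution can be checked directly against conditions (i), (ii) and (iii) of Theorem 16. The following sketch uses exact rational arithmetic; the helper names are illustrative, and the data are those of the example above with the slack variable $x_3$ included.

```python
from fractions import Fraction as F

# Data of the worked example (m = 1, n = 3).
A = [[3, 2, 1]]
b = [6]
D = [[2, 0, 0], [0, 2, 0], [0, 0, 0]]
c = [-8, -10, 0]

# Candidate optimum and multiplier taken from the text.
x0 = [F(4, 13), F(33, 13), F(0)]
lam0 = [F(32, 13)]

def matvec(M, v):
    """Matrix-vector product for lists of lists."""
    return [sum(mij * vj for mij, vj in zip(row, v)) for row in M]

def transpose(M):
    return [list(col) for col in zip(*M)]

# (ii) defines mu0 = D x0 + c + A^T lam0.
mu0 = [d + ci + a
       for d, ci, a in zip(matvec(D, x0), c, matvec(transpose(A), lam0))]

cond_i = matvec(A, x0) == b and all(xj >= 0 for xj in x0)
cond_ii = all(mj >= 0 for mj in mu0)
cond_iii = sum(mj * xj for mj, xj in zip(mu0, x0)) == 0
print(mu0, cond_i, cond_ii, cond_iii)
```

All three conditions hold, so by Theorem 16 the vector $x_0$ solves the q.p.p.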
The above approach to quadratic programming problems is due 


to P. Wolfe. For further information and other methods see {9} and 
{10}. 


14.3 Functional Approximation 

A central problem in mathematics is to approximate a given function
f(x) by a simpler function p(x) of specified form. Typically p(x) is
a polynomial of degree n,

$$p(x) = p_0 + p_1x + \dots + p_nx^n,$$

but we may just as easily consider the more general case where,
instead of a linear combination of powers of x, p(x) is a linear
combination of some chosen basis functions $\phi_0(x), \phi_1(x), \dots, \phi_n(x)$.

If we regard

$$e(x) = f(x) - p(x)$$

as the error in the approximation, then e(x) is a linear function of
the parameters $p_0, p_1, \dots, p_n$, and what is usually required is the best
choice of $p = (p_0, p_1, \dots, p_n)^T$, namely that which makes e(x) as small
as possible.
The solution of such a problem depends on the way in which we
measure the size of the function e(x). The measures commonly used
in practice are the $L_1$, $L_2$ and $L_\infty$ function norms defined by

$$\|e\|_1 = \int_a^b |e(x)|\,dx,$$
$$\|e\|_2 = \left(\int_a^b (e(x))^2\,dx\right)^{1/2}$$

and

$$\|e\|_\infty = \max_{a \le x \le b} |e(x)|$$

respectively, where $a \le x \le b$ is the interval on which the approximation
is required.

When f(x) is known at only a finite number of points $x_1, x_2, \dots,
x_m$, we have a discrete approximation problem and the corresponding
measures of the error are the $L_1$, $L_2$ and $L_\infty$ vector norms $\|e\|_1$, $\|e\|_2$,
$\|e\|_\infty$ defined by

$$\sum_{i=1}^{m} |e_i|, \quad \left(\sum_{i=1}^{m} e_i^2\right)^{1/2}, \quad \max_{1 \le i \le m} |e_i| \quad \text{respectively},$$

where $e_i = e(x_i)$, $i = 1,2,\dots,m$. A discrete function approximation
problem is often used to provide an approximate solution of the 
corresponding continuous function approximation problem. For further 
information on functional approximation see {6}, {7}, {10}. 
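
The three discrete measures of error can be computed as follows. This is a small illustrative sketch; the function name and the sample values are hypothetical.

```python
import math

def discrete_norms(f_values, p_values):
    """Return the L1, L2 and L-infinity vector norms of e_i = f(x_i) - p(x_i)."""
    e = [f - p for f, p in zip(f_values, p_values)]
    l1 = sum(abs(ei) for ei in e)
    l2 = math.sqrt(sum(ei * ei for ei in e))
    linf = max(abs(ei) for ei in e)
    return l1, l2, linf

# Hypothetical data: the error vector is (3, -4, 0).
print(discrete_norms([3, 0, 5], [0, 4, 5]))
```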

For $L_2$ approximation an explicit expression for the best approximation
is available in both the discrete and continuous cases. When
the parameters $p_0, p_1, \dots, p_n$ are constrained to lie in some given
intervals the best approximation can be found by solving a q.p.p.
as described in exercise 14.2.

For discrete $L_1$ approximation and discrete $L_\infty$ approximation, the
best approximation can be found by solving a l.p.p.


14.4 $L_\infty$ Approximation
For any approximating function p(x) we define

$$e = \max_{1 \le i \le m} |f(x_i) - p(x_i)|,$$

so that $|f(x_i) - p(x_i)| \le e$, $i = 1,2,\dots,m$, or

$$f(x_i) - p(x_i) \le e, \quad f(x_i) - p(x_i) \ge -e, \quad i = 1,2,\dots,m. \qquad (1)$$

Since $p(x_i) = p_0\phi_0(x_i) + p_1\phi_1(x_i) + \dots + p_n\phi_n(x_i)$ and e are unknown,
we write the constraints (1) as

$$p(x_i) + e \ge f(x_i), \quad p(x_i) - e \le f(x_i), \quad i = 1,2,\dots,m, \qquad (2)$$

and the approximation problem becomes the l.p.p.

$$\text{minimise } e = (0, 0, \dots, 0, 1)\begin{pmatrix} p \\ e \end{pmatrix} \text{ subject to the } 2m \text{ linear constraints (2).}$$
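To solve this l.p.p., if we put

$$x_{n+1} = \Bigl|\min_{j:\,p_j < 0} p_j\Bigr| \quad (\text{zero if no } p_j \text{ is negative}) \qquad (3)$$

and

$$x_j = p_j + x_{n+1}, \quad j = 0,1,\dots,n,$$

then p(x) becomes

$$x_0\phi_0(x) + x_1\phi_1(x) + \dots + x_n\phi_n(x) - x_{n+1}\sum_{j=0}^{n}\phi_j(x),$$

and the constraints (2) become

$$a_{i0}x_0 + a_{i1}x_1 + \dots + a_{in}x_n + a_{i,n+1}x_{n+1} + e \ge f_i,$$
$$a_{i0}x_0 + a_{i1}x_1 + \dots + a_{in}x_n + a_{i,n+1}x_{n+1} - e \le f_i, \qquad (4)$$

for $i = 1,2,\dots,m$, where $f_i = f(x_i)$ and

$$a_{ij} = \phi_j(x_i), \quad j = 0,1,\dots,n, \qquad a_{i,n+1} = -\sum_{j=0}^{n}\phi_j(x_i). \qquad (5)$$

The 2m constraints (4) now involve (n + 3) non-negative variables
$x_0, x_1, \dots, x_{n+1}, e$ so we have a l.p.p. in standard form if we multiply
the second constraint of (4) by −1.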

As we will usually have $m \gg n$ we turn our attention to the dual
problem, which is

$$\text{maximise } \sum_{i=1}^{m} u_if_i - \sum_{i=1}^{m} v_if_i \text{ subject to}$$

$$u^TA - v^TA \le 0, \quad u^Te + v^Te \le 1, \quad u, v \ge 0, \qquad (6)$$

where A is the $m \times (n + 2)$ matrix defined by (3) and (5) and e is
the m-vector $(1, 1, \dots, 1)^T$.

The l.p.p. (6) involves (n + 3) constraints in 2m non-negative variables
but as, by the definition of $x_{n+1}$ and $a_{i,n+1}$, the (n + 2)-th constraint
is equal to the sum of the first (n + 1) constraints multiplied by −1
we have one redundant constraint in the dual problem.

Solving this l.p.p. by introducing (n + 3) slack variables will involve
eliminating one of the (n + 3) equations as described in section 4.5,
so that the optimum solution of the primal will satisfy (n + 2) of
the primal inequality constraints as equalities. These (n + 2) constraints
must correspond to distinct points $x_i$ (ER) so we see that the error
in the best approximation will attain its maximum value at least (n + 2)
times. This corresponds to the celebrated Chebyshev Equioscillation
Theorem which characterises best $L_\infty$ approximation and states, for
the continuous case, that

p(x) is the best approximation to f(x) on $a \le x \le b$ in the sense
of the $L_\infty$ norm if and only if f(x) − p(x) attains its maximum
magnitude at (n + 2) points (at least) in $a \le x \le b$, and that
f(x) − p(x) is alternately positive and negative at each pair of
adjacent points.

An extensive discussion of $L_\infty$ approximation is given in {10}, and
details of an efficient algorithm in {17}.
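
The simplest instance of this equioscillation behaviour is the case n = 0, where p(x) is a single constant $p_0$. The sketch below assumes hypothetical data values and illustrative names; it checks that the midpoint of the smallest and largest values gives a maximum error attained at n + 2 = 2 points with opposite signs, and that no constant on a coarse grid does better. It does not solve the l.p.p. itself.

```python
from fractions import Fraction as F

def max_error(values, p0):
    """Discrete L-infinity error of the constant approximation p0."""
    return max(abs(v - p0) for v in values)

# Hypothetical data f(x_1), ..., f(x_4).
values = [F(2), F(7), F(3), F(5)]

# Candidate best constant: midpoint of the smallest and largest values.
best = (min(values) + max(values)) / 2
err = max_error(values, best)

# Points at which the error attains its maximum magnitude, with signs.
extremal = [(i, v - best) for i, v in enumerate(values) if abs(v - best) == err]
print(best, err, extremal)

# No constant on the grid 0, 0.1, ..., 10 achieves a smaller maximum error.
no_better = all(max_error(values, F(k, 10)) >= err for k in range(101))
print(no_better)
```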


14.5 $L_1$ Approximation

Again we denote $f(x_i) - p(x_i)$ by $e_i$, $i = 1,2,\dots,m$, and as $e_i$ is a
variable whose value is to be determined but which may be positive
or negative, we replace it by the difference of two non-negative variables

$$e_i = z_i - w_i, \quad i = 1,2,\dots,m.$$

The objective function $\sum_{i=1}^{m} |e_i|$ will be minimised when $\sum_{i=1}^{m} (z_i + w_i)$
is minimised (ER).

So the $L_1$ approximation problem becomes

$$\text{minimise } \sum_{i=1}^{m} (z_i + w_i)$$
$$\text{subject to } \sum_{j=0}^{n} (p'_j - q_j)\phi_j(x_i) + z_i - w_i = f_i, \quad i = 1,2,\dots,m, \qquad (1)$$
$$\text{and } p'_0, p'_1, \dots, p'_n, q_0, q_1, \dots, q_n, z_1, \dots, z_m, w_1, \dots, w_m \ge 0,$$

where the variables $p_j$, $j = 0,1,\dots,n$, have been written as the difference
of two non-negative variables $p'_j$, $q_j$.

With $\phi_j(x_i) = a_{ij}$ the l.p.p. (1) may be written

$$\text{minimise } (0^T, 0^T, e^T, e^T)\begin{pmatrix} p' \\ q \\ z \\ w \end{pmatrix}$$
$$\text{subject to } (A, -A, I_m, -I_m)\begin{pmatrix} p' \\ q \\ z \\ w \end{pmatrix} = b, \quad \begin{pmatrix} p' \\ q \\ z \\ w \end{pmatrix} \ge 0, \qquad (2)$$

where $b_i = f_i$, $i = 1,2,\dots,m$, $p'$ and $q$ are (n + 1)-vectors and z and
w are m-vectors.

The dual of (2) is a bounded variable l.p.p. (see exercise 14.3).
Although there is a modification of the simplex method to solve
bounded variable problems directly (see {9}), the special form of
the constraints of (2) has led to the development of an algorithm
for solving the primal directly, using only the $m \times n$ array A of
coefficients (see {18}).

As both $L_1$ and $L_\infty$ approximation problems have been expressed
as l.p.p.s, we observe that similar problems in which there are linear
constraints on the coefficients $p_j$ of the approximating function

$$p(x) = p_0\phi_0(x) + p_1\phi_1(x) + \dots + p_n\phi_n(x)$$

can also be solved by simply including these extra constraints in
the l.p.p.


14.6 
An interesting application of linear programming is provided by
the Wielandt-Hoffman Theorem on the eigenvalues of symmetric
matrices. If A is an $n \times n$ symmetric matrix ($A = A^T$) with eigenvalues
$\lambda_1, \lambda_2, \dots, \lambda_n$ and B is an $n \times n$ symmetric matrix with eigenvalues
$\mu_1, \mu_2, \dots, \mu_n$ then provided that the order of the $\mu_i$, say, is suitably
chosen,

$$\sum_{i=1}^{n} (\lambda_i - \mu_i)^2 \le \|A - B\|^2, \qquad (1)$$

where $\|A\|$ denotes the matrix norm defined by

$$\|A\|^2 = \sum_{i,j=1}^{n} a_{ij}^2 = \text{trace}(A^TA).$$

The most immediate application is to the situation in which
$B = A + \delta A$, and then the result gives information about the perturbation
of the spectrum of eigenvalues of A when A is perturbed.


For any symmetric matrix A there exists an orthogonal matrix Q
($Q^{-1} = Q^T$) such that $Q^TAQ = D$, where D is a diagonal matrix
whose diagonal elements are the eigenvalues of A. For the matrix
norm defined above, $\|Q^TAQ\| = \|A\|$ for any orthogonal matrix Q,
and if A is symmetric so is $Q^TAQ$.

Thus if $Q_A^TAQ_A = D_\lambda$ and $Q_B^T(Q_A^TBQ_A)Q_B = D_\mu$, then

$$\|A - B\|^2 = \|D_\lambda - Q_BD_\mu Q_B^T\|^2 \ge \min_Q \|D_\lambda - QD_\mu Q^T\|^2. \qquad (2)$$

Now the set of orthogonal matrices Q includes all permutation matrices
P (see section 3.7), and if we prove that the minimum in (2) is attained
at a permutation matrix then $QD_\mu Q^T$ is just $D_\mu$ with its diagonal
elements re-ordered, so that

$$\min_Q \|D_\lambda - QD_\mu Q^T\|^2 = \sum_{i=1}^{n} (\lambda_i - \mu_i)^2 \qquad (3)$$

and we will have established the result (1).

The problem

$$\text{minimise } \|D_\lambda - QD_\mu Q^T\|^2 \text{ subject to } Q^TQ = I$$

can, surprisingly, be rewritten as a l.p.p., because

$$\|D_\lambda - QD_\mu Q^T\|^2 = \text{trace}\,(D_\lambda - QD_\mu Q^T)^T(D_\lambda - QD_\mu Q^T)$$
$$= \text{trace}\,(D_\lambda^TD_\lambda + QD_\mu^TD_\mu Q^T - D_\lambda QD_\mu Q^T - QD_\mu^TQ^TD_\lambda)$$
$$= \text{trace}\,(D_\lambda^TD_\lambda + D_\mu^TD_\mu) + f(Q),$$

where

$$f(Q) = \text{trace}\,(-D_\lambda QD_\mu Q^T - QD_\mu Q^TD_\lambda). \qquad (4)$$
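If Q has rows $q_1^T, q_2^T, \dots, q_n^T$, then the rows of $D_\lambda Q$ are $\lambda_1q_1^T,
\lambda_2q_2^T, \dots, \lambda_nq_n^T$ and the columns of $D_\mu Q^T$ are $D_\mu q_1, D_\mu q_2, \dots, D_\mu q_n$.
Hence

$$f(Q) = -\left(\sum_{i=1}^{n} \lambda_iq_i^TD_\mu q_i + \sum_{i=1}^{n} q_i^TD_\mu\lambda_iq_i\right) \qquad (5)$$
$$= -\left(\sum_{i=1}^{n}\sum_{j=1}^{n} \lambda_i\mu_jq_{ij}^2 + \sum_{i=1}^{n}\sum_{j=1}^{n} q_{ij}^2\mu_j\lambda_i\right)$$
$$= -2\sum_{i=1}^{n}\sum_{j=1}^{n} \lambda_i\mu_jq_{ij}^2.$$

Since $Q^TQ = I$, $\sum_i q_{ij}^2 = \sum_j q_{ij}^2 = 1$, $i, j = 1,2,\dots,n$, so writing
$q_{ij}^2 = x_{ij}$ and $2\lambda_i\mu_j = r_{ij}$ we see that the problem

minimise $\|D_\lambda - QD_\mu Q^T\|^2$ over the set of matrices Q such that $Q^TQ = I$

becomes

$$\text{maximise } -f(Q) \text{ subject to } Q^TQ = I, \text{ or}$$
$$\text{maximise } \sum_{i,j} r_{ij}x_{ij} \text{ subject to } \sum_i x_{ij} = \sum_j x_{ij} = 1, \quad x_{ij} \ge 0, \quad i, j = 1,2,\dots,n. \qquad (6)$$

This is precisely the marriage problem version of the transportation
problem which, as we saw in chapters 10 and 12, is solved by a matrix
X which is a permutation matrix, and so the assertion (1) is established.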
For a slightly more general result on matrix eigenvalues see {19}. 
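
The inequality (1) is easy to test numerically for 2 × 2 symmetric matrices, whose eigenvalues have a closed form. The sketch below uses two hypothetical matrices and illustrative function names; pairing the eigenvalues in ascending order is taken here as the suitable ordering.

```python
import math

def eig_sym_2x2(a, b, d):
    """Eigenvalues, in ascending order, of the symmetric matrix [[a, b], [b, d]]."""
    mean = (a + d) / 2
    radius = math.hypot((a - d) / 2, b)
    return mean - radius, mean + radius

def frobenius_sq(M):
    """Sum of squares of all elements, i.e. trace(M^T M)."""
    return sum(x * x for row in M for x in row)

# Hypothetical symmetric matrices, each given as (a, b, d).
A = (2.0, 1.0, 3.0)
B = (1.0, -1.0, 4.0)

lam = eig_sym_2x2(*A)
mu = eig_sym_2x2(*B)

lhs = sum((l - m) ** 2 for l, m in zip(lam, mu))
diff = [[A[0] - B[0], A[1] - B[1]],
        [A[1] - B[1], A[2] - B[2]]]
rhs = frobenius_sq(diff)
print(lhs <= rhs)
```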
Exercises 14
1. Use the modified simplex method described in section 14.2 to solve
the quadratic programming problem

minimise $3x_1^2 + 2x_2^2 + 2x_1x_2 - 18x_1 - 16x_2$
subject to (i) $x_1 + 2x_2 \le 10$, $x_1, x_2 \ge 0$,
(ii) $2x_1 + x_2 \le 5$, $x_1, x_2 \ge 0$.
2. It is required to find the best quadratic approximation
$p_0 + p_1x + p_2x^2$ to a function g(x) whose values $g_i$ at N points
$x_1, x_2, \dots, x_N$ are known, where the coefficients $p_0, p_1, p_2$ must
satisfy

b,=p,= 6b, 

b,=p,=)b, 

b, = p, = b, 


and where best is to be interpreted as that which minimises the
sum of squares of the residuals

$$\sum_{i=1}^{N} \left(g(x_i) - (p_0 + p_1x_i + p_2x_i^2)\right)^2.$$

Formulate this problem as a q.p.p. and explain why it always
has a solution.
3. Show that the l.p.p. (2) of section 14.5 always has an immediate
initial b.f.s.

Obtain the dual of this l.p.p. and verify that it is a bounded
variable l.p.p. in which the usual non-negativity constraints on
the variables are replaced by intervals in which they must lie.

4. For an overspecified system of linear equations

$$Ax = b, \quad \text{where } A \text{ is } m \times n, \quad m > n,$$

we cannot expect in general that there is a solution x satisfying
the equations. Denoting the residual by r,

$$Ax - b = r,$$

we may instead seek the best solution vector x in the sense of
minimising a norm of r. Formulate the l.p.p.s for obtaining the best solution
x
(i) using the $L_1$ vector norm of r,
(ii) using the $L_\infty$ vector norm of r.
APPENDIX 1


PROOF OF THEOREM 2 (Section 2.6) 


Assume that the l.p.p. is in canonical form. Let $x_0$ be a point
at which $c^Tx$ is minimised, and let $c^Tx_0 = f_0$.

We may assume that R is bounded, because if it is not we may
add to the set of constraints the constraints $x_j \le K$, $j = 1,2,\dots,n$,
where K is any sufficiently large number, e.g. $K = \max_j (x_0)_j$. This
will change R but not the solution of the l.p.p.

The point $x_0$ belongs to the hyperplane H,

$$H = \{x \mid c^Tx = f_0\}.$$

Since H and R are closed convex sets and R is bounded,
$T = H \cap R$ is a closed bounded convex set.

Therefore we can define a sequence of sets $T, X_1, X_2, \dots, X_n$, each
closed, bounded and convex, and contained in the previous one, as
follows:

$$X_1 = \{x^* \in T \mid x_1^* = \min_{x \in T} x_1\},$$
$$X_2 = \{x^* \in X_1 \mid x_2^* = \min_{x \in X_1} x_2\},$$
$$\vdots$$
$$X_n = \{x^* \in X_{n-1} \mid x_n^* = \min_{x \in X_{n-1}} x_n\}.$$

Now, $X_n$ is not empty because T is not empty ($x_0$ at least belongs
to T). In fact, $X_n$ contains a single point, y say; for suppose y and
z belong to $X_n$, then by the definition of $X_n$, $y_n = z_n$. By the definition
of $X_{n-1}$, $y_{n-1} = z_{n-1}$ and so on until $y_1 = z_1$ and thus y = z.
The point y is an extreme point of T (ER) and hence is an extreme
point of R. For suppose y is not an extreme point of R, then there
are $x_1$ and $x_2 \in R$ such that

$$y = \alpha x_1 + (1 - \alpha)x_2, \quad \text{where } x_1 \ne x_2, \quad 0 < \alpha < 1.$$

If $x_1 \notin T$ then $c^Tx_1 > f_0$ because $c^Tx \ge f_0$ for any $x \in R$ and
$c^Tx_1 \ne f_0$. Also $c^Tx_2 \ge f_0$. Therefore $c^Ty = \alpha c^Tx_1 + (1 - \alpha)c^Tx_2$,

$$\text{i.e. } c^Ty > \alpha f_0 + (1 - \alpha)f_0 = f_0,$$

which contradicts $c^Ty = f_0$ (as $y \in T$). Therefore $x_1 \in T$ and similarly
$x_2 \in T$. But this implies that y is not an extreme point of T, which
contradicts something we (!) have already established. Therefore y is an extreme
point of R and since $y \in T$, $c^Ty = f_0$ and we have the required result.

Some comments about this rather tortuous proof are appropriate.

It uses the fact that a continuous function ($x_j$ is a continuous function
of x) attains its maximum and minimum values over a closed, bounded
set at a point of the set, and so we need R to be bounded. In practice, 
in the simplex method, we do not need to make sure that R is bounded 
and so we do not need to choose K and add in the extra constraints. 
For this reason we do not now have to investigate whether y is one 
of the extreme points of the original R or an extreme point of the 
new R created by the extra constraints. 

The theorem is not constructive: it does not provide us with a 
practical way of finding x, or y. This is partly because the definition 
of an extreme point of R is not a constructive one. 


APPENDIX 2 


DUALITY THEOREM: THIRD PROOF 


We first prove the theorem of the separating hyperplane, in a 
somewhat abstract setting, and then use it to establish the existence 
of an optimum solution of the dual, given that the primal has an 
optimum solution. We take advantage of some simple results estab- 
lished during the first proof, but these do not rely on the simplex 
method and are easily established independently. 


The Theorem of the Separating Hyperplane 
Let S be a closed convex set and let b be any point not in S,
then there is a vector y such that

$$y^Tb < \inf_{z \in S} y^Tz.$$

To establish this result, let

$$\delta = \inf_{z \in S} \|z - b\|, \quad \text{where the vector norm } \|z\| = \|z\|_2 = (z^Tz)^{1/2}.$$

As $b \notin S$, $\delta > 0$. As S is closed and

$$\inf_{z \in S} \|z - b\| = \inf_{z \in S,\ \|z\| \le K} \|z - b\| \quad \text{for } K \text{ sufficiently large},$$

and $\{z \mid z \in S, \|z\| \le K\}$ is compact, there exists
$z_0 \in S$ such that $\|z_0 - b\| = \delta$.

Put $y_0 = z_0 - b$. We show that $y_0$ is a satisfactory choice for
y above. For $0 < \alpha < 1$ and any $z \in S$,
$\alpha z + (1 - \alpha)z_0 \in S$, i.e. $z_0 + \alpha(z - z_0) \in S$.
Therefore $\|z_0 + \alpha(z - z_0) - b\|_2^2 \ge \|z_0 - b\|_2^2$ and therefore
$2\alpha(z_0 - b)^T(z - z_0) + \alpha^2(z - z_0)^T(z - z_0) \ge 0$ and considering this
result as $\alpha \to 0$, we see that we must have

$$(z_0 - b)^T(z - z_0) \ge 0,$$
$$\text{i.e. } (z_0 - b)^Tz \ge (z_0 - b)^Tz_0 = (z_0 - b)^Tb + (z_0 - b)^T(z_0 - b) = (z_0 - b)^Tb + \delta^2.$$

So putting $z_0 - b = y$, we have $y^Tz \ge y^Tb + \delta^2$ for any $z \in S$,

$$\text{i.e. } y^Tb < \inf_{z \in S} y^Tz.$$
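
The construction in this proof can be followed numerically when S is a set for which the nearest point $z_0$ is easy to compute. The sketch below takes S to be the unit square, which is an assumption made purely for illustration, as are the function names; because $y^Tz$ is linear in z, its infimum over the square is attained at a corner.

```python
from itertools import product

def project_onto_box(b, lo, hi):
    """Nearest point (Euclidean norm) to b in the box lo <= z_k <= hi."""
    return [min(max(bk, lo), hi) for bk in b]

def dot(u, v):
    return sum(uk * vk for uk, vk in zip(u, v))

b = [2, 3]                          # a point outside S = [0, 1] x [0, 1]
z0 = project_onto_box(b, 0, 1)      # nearest point of S to b
y = [zk - bk for zk, bk in zip(z0, b)]
delta_sq = dot(y, y)                # delta^2 = ||z0 - b||^2

# Infimum of y^T z over S, found by checking the four corners.
inf_yz = min(dot(y, list(corner)) for corner in product([0, 1], repeat=2))

print(z0, y, delta_sq)
print(dot(y, b), inf_yz)            # y^T b is strictly smaller than the infimum
```

Here the infimum equals $y^Tb + \delta^2$ exactly, since it is attained at $z_0$ itself.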
Exercise: prove that this result implies that

for any matrix A and any vector b,

either (i) there is a vector $x \ge 0$ such that $Ax = b$,

or (ii) there is a vector y such that $y^Tb < 0$, $y^TA \ge 0^T$,

by putting $S = \{z \mid Ax = z, x \ge 0\}$ and deducing that
(i) false implies (ii) true, using $\inf y^Tz = \inf y^TAx \le 0$.


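To establish the duality theorem, consider the l.p.p. in canonical
primal form

$$\text{minimise } c^Tx \text{ subject to } Ax = b, \quad x \ge 0. \qquad (1)$$

Suppose that an optimum solution $x_0$ exists, and put $c^Tx_0 = f_0$. Define
a set S of (m + 1)-vectors z as follows:

$$z = \begin{pmatrix} z_1 \\ z_2 \end{pmatrix}, \quad \text{where } z_1 \text{ is a scalar and } z_2 \text{ is an } m\text{-vector},$$

$$S = \{z \mid z_1 = tf_0 - c^Tx, \ z_2 = tb - Ax, \ t \ge 0, \ x \ge 0\},$$

i.e. any $t \ge 0$ and any $x \ge 0$ define a vector z in S.

Exercise: prove that S is a closed, convex cone.

We now show that $\begin{pmatrix} 1 \\ 0 \end{pmatrix} \notin S$, i.e. $z_1 = 1$, $z_2 = 0$ implies that $z \notin S$.
Suppose $z_2 = 0 = t^*b - Ax^*$ and $z_1 = 1 = t^*f_0 - c^Tx^*$, for some
$t^* \ge 0$ and $x^* \ge 0$. Then if $t^* > 0$, $\frac{1}{t^*}x^*$ is feasible for the canonical
primal l.p.p. (1) and

$$c^T\left(\frac{1}{t^*}x^*\right) = f_0 - \frac{1}{t^*}, \quad \text{i.e. } c^T\left(\frac{1}{t^*}x^*\right) < c^Tx_0, \text{ which is a contradiction.}$$

Alternatively, if $t^* = 0$ then $c^Tx^* = -1$ and $Ax^* = 0$ with $x^* \ge 0$,
so $A(x_0 + \alpha x^*) = b$ and $c^T(x_0 + \alpha x^*) = f_0 - \alpha < f_0$ for $\alpha > 0$,
which again contradicts the definition of $x_0$.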

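So $\begin{pmatrix} 1 \\ 0 \end{pmatrix} \notin S$, and $\begin{pmatrix} 1 \\ 0 \end{pmatrix}$ can take the role of b in the theorem of
the separating hyperplane. Therefore there exists an (m + 1)-vector
$\begin{pmatrix} \alpha \\ y \end{pmatrix}$ say, such that

$$(\alpha, y^T)\begin{pmatrix} 1 \\ 0 \end{pmatrix} < (\alpha, y^T)\begin{pmatrix} z_1 \\ z_2 \end{pmatrix} \quad \text{for all } \begin{pmatrix} z_1 \\ z_2 \end{pmatrix} \in S,$$

i.e. $\alpha < \inf_{z \in S}(\alpha z_1 + y^Tz_2) = \sigma$ say, and $\sigma \le 0$ because $\begin{pmatrix} 0 \\ 0 \end{pmatrix} \in S$,
but if for some $z^* = \begin{pmatrix} z_1^* \\ z_2^* \end{pmatrix}$ which belongs to S it is true that
$(\alpha, y^T)z^* < 0$ then $(\alpha, y^T)kz^*$ can be made arbitrarily large and negative
by taking k sufficiently large and $kz^* \in S$.

Hence $(\alpha, y^T)kz^* < \alpha$ for k sufficiently large, which contradicts
the assertion about $(\alpha, y^T)$ in the theorem of the separating hyperplane.

Therefore $\sigma = 0$, and therefore $\alpha < 0$. Hence any $\alpha < 0$ will
suffice and we can choose $\alpha = -1$.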

Thus there exists a vector $y$ such that
$$(-1, y^T) \binom{1}{0} < (-1, y^T) z \quad \text{for any } z \in S,$$
and therefore, as $S$ is a cone, $(-1, y^T) z \ge 0$ for this $y$.

That is, there exists an $m$-vector $y$ such that
$$-z_1 + y^T z_2 \ge 0 \quad \text{for all } z \in S,$$
therefore $-t f_0 + c^T x + t y^T b - y^T A x \ge 0$ for all $t \ge 0$ and $x \ge 0$.

So $t(y^T b - f_0) + (c^T - y^T A) x \ge 0$ for all $t \ge 0$ and $x \ge 0$,
i.e. $y^T A \le c^T$,
so $y$ satisfies the dual constraints, and $y^T b \ge f_0$.

But we know that for any $x$ and $y$ satisfying primal and dual
constraints respectively that
$$y^T b \le c^T x, \quad \text{that is } y^T b \le f_0 \quad \text{(see (5) of section 5.4).}$$
Therefore $y^T b = f_0$, so $y$ is an optimum solution for the dual l.p.p.
with the same optimum value as that of the primal.
This approach to the duality theorem follows that in {12}.
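
The conclusion can be checked numerically on a small problem. The following is a sketch only: the data `A`, `b`, `c` and the helper `solve` are hypothetical, and the brute-force search over bases assumes that an optimum exists at a basic feasible solution and that the optimal basis is non-degenerate, so that the solution of $B^T y = c_B$ is dual feasible.

```python
from fractions import Fraction
from itertools import combinations

def solve(M, rhs):
    """Solve the square system M x = rhs exactly; return None if M is singular."""
    n = len(M)
    T = [[Fraction(v) for v in row] + [Fraction(r)] for row, r in zip(M, rhs)]
    for k in range(n):
        p = next((i for i in range(k, n) if T[i][k] != 0), None)
        if p is None:
            return None
        T[k], T[p] = T[p], T[k]
        T[k] = [v / T[k][k] for v in T[k]]
        for i in range(n):
            if i != k:
                T[i] = [vi - T[i][k] * vk for vi, vk in zip(T[i], T[k])]
    return [row[n] for row in T]

# Hypothetical canonical primal: minimise c^T x subject to Ax = b, x >= 0.
A = [[1, 1, 1, 0], [1, 3, 0, 1]]
b = [4, 6]
c = [-2, -3, 0, 0]
m, n = len(A), len(A[0])

best = None
for basis in combinations(range(n), m):
    B = [[A[i][j] for j in basis] for i in range(m)]
    xB = solve(B, b)
    if xB is None or any(v < 0 for v in xB):
        continue  # singular basis matrix or infeasible basic solution
    value = sum(c[j] * v for j, v in zip(basis, xB))
    if best is None or value < best[0]:
        best = (value, basis)

f0, basis = best
# Dual candidate: y solves B^T y = c_B for the optimal basis.
BT = [[A[i][j] for i in range(m)] for j in basis]
y = solve(BT, [c[j] for j in basis])
dual_feasible = all(sum(y[i] * A[i][j] for i in range(m)) <= c[j] for j in range(n))
dual_value = sum(yi * bi for yi, bi in zip(y, b))
print(f0, y, dual_feasible, dual_value)
```

Exact rational arithmetic is used so that the equality $y^T b = f_0$ can be tested without rounding effects.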
APPENDIX 3


SOLVING SYSTEMS OF LINEAR EQUATIONS: 
GAUSSIAN ELIMINATION WITH INTERCHANGES; 
TRIANGULAR DECOMPOSITION 


Throughout this book the examples are solved using exact arithmetic
operations. This is very convenient for pedagogic purposes since the
theoretical development assumes that this is the case. In practice,
however, the arithmetic operations that computers perform are slightly
inaccurate. For example, 1 divided by 7 has an infinite decimal and
binary representation and so the result cannot be stored exactly; also
the product of two $t$-digit numbers usually has $2t$ digits and so this
product cannot be stored exactly in a $t$-digit computer. All numbers
stored in computers and the results of arithmetic operations are
represented by numbers with a fixed number of digits so that input
data and the results of calculations have to be rounded off. These
arithmetic errors are equivalent to a perturbation of the problem being
solved, so that given an $n \times n$ matrix $A$ and an $n$-vector $b$, whichever
method is chosen to solve the system of equations $Ax = b$, we obtain
not $x$ but a computed solution which we call $x_c$, and for which

$$A x_c \ne b, \quad \text{but} \quad (A + \delta A) x_c = (b + \delta b),$$
where $x_c$, $\delta A$, $\delta b$ depend on the method used as well as on $A$ and
$b$.

An acceptable method is one which is both efficient in terms of
the total number of arithmetic operations it requires, and accurate
in the sense that $\delta A$ and $\delta b$ are small. In practice, $\delta A$ and $\delta b$ are
unobtainable but, for any particular method, bounds for the possible
magnitude of $\delta A$ and $\delta b$ can be found, so that a better method
is one for which these bounds are smaller. Note also the stress
on the difference between $(A, b)$ and the system actually solved
$(A + \delta A, b + \delta b)$, rather than between $x$ and $x_c$.
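
The two sources of error mentioned above can be observed directly. This is a small sketch using Python's standard `fractions` and `decimal` modules; the two-digit context and the sample factors are chosen here purely for illustration.

```python
from decimal import Context, Decimal
from fractions import Fraction

# 1/7 has no finite binary expansion, so the stored value is not exactly 1/7.
exact = Fraction(1, 7)
stored = 1 / 7
print(Fraction(stored) == exact)

# The product of two 2-digit numbers needs up to 4 digits; a 2-digit
# machine must round it.
two_digit = Context(prec=2)
full_product = Decimal("0.95") * Decimal("0.46")
rounded = two_digit.multiply(Decimal("0.95"), Decimal("0.46"))
print(full_product, rounded)
```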

The natural way to solve $Ax = b$ is to systematically eliminate
variables from equations by elementary row operations until the
resulting system is triangular, and then to obtain the elements of
$x$ by successive substitution.

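Denoting the given system by $(A^{(1)}, b^{(1)})$ and assuming $a_{11}^{(1)} \ne 0$ we
add

$$\left( m_{i1} \times \text{1st row of } (A^{(1)}, b^{(1)}) \right) \text{ to the } i\text{-th row for } i = 2, 3, \ldots, n,$$

where $m_{i1} = -a_{i1}^{(1)} / a_{11}^{(1)}$.

Thus
$$a_{ij}^{(2)} = a_{ij}^{(1)} + m_{i1} a_{1j}^{(1)}, \quad b_i^{(2)} = b_i^{(1)} + m_{i1} b_1^{(1)}, \quad i, j = 2, 3, \ldots, n, \tag{1}$$
and $a_{i1}^{(2)} = 0$, $i = 2, 3, \ldots, n$.

In general at the $k$-th stage we have $(A^{(k)}, b^{(k)})$,
where $a_{ij}^{(k)} = 0$, $j = 1, 2, \ldots, k-1$, $i > j$,
and then
$$\begin{aligned}
a_{ij}^{(k+1)} &= a_{ij}^{(k)} + m_{ik} a_{kj}^{(k)}, & i, j &= k+1, k+2, \ldots, n, \\
\text{where } m_{ik} &= -a_{ik}^{(k)} / a_{kk}^{(k)}, & i &= k+1, k+2, \ldots, n, \\
b_i^{(k+1)} &= b_i^{(k)} + m_{ik} b_k^{(k)}, & i &= k+1, k+2, \ldots, n, \\
\text{and } a_{ik}^{(k+1)} &= 0, & i &= k+1, k+2, \ldots, n.
\end{aligned} \tag{2}$$

After $(n-1)$ such elimination stages we have $(A^{(n)}, b^{(n)})$ which we
denote by $(U, b')$, where $U$ is an upper triangular matrix, $u_{ij} = 0$ if $i > j$.

The solution of $Ux = b'$ can be obtained by back-substitution

$$x_n = b'_n / u_{nn},$$
$$x_i = \Big( b'_i - \sum_{j=i+1}^{n} u_{ij} x_j \Big) \Big/ u_{ii}, \quad i = n-1, n-2, \ldots, 1. \tag{3}$$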
This method of obtaining x is called Gaussian elimination and as 
described is not satisfactory in general. 
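
The elimination stages (2) and the back-substitution (3) can be sketched as follows. The function name `gaussian_elimination` is ours, exact rational arithmetic is used (so none of the rounding difficulties discussed in this appendix arise), and no interchanges are performed, so a zero pivot simply stops the process.

```python
from fractions import Fraction

def gaussian_elimination(A, b):
    """Solve Ax = b by elimination without interchanges, then back-substitution."""
    n = len(A)
    A = [[Fraction(v) for v in row] for row in A]
    b = [Fraction(v) for v in b]
    for k in range(n - 1):                    # elimination stages
        if A[k][k] == 0:
            raise ZeroDivisionError("zero pivot: an interchange is needed")
        for i in range(k + 1, n):
            m = -A[i][k] / A[k][k]            # multiplier m_ik of (2)
            for j in range(k, n):
                A[i][j] += m * A[k][j]
            b[i] += m * b[k]
    x = [Fraction(0)] * n
    for i in range(n - 1, -1, -1):            # back-substitution (3)
        s = sum(A[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (b[i] - s) / A[i][i]
    return x

# A 3 x 3 system whose exact solution is x = (1, 1, 1).
A = [["0.24", "-0.32", "0.18"],
     ["0.94", "-0.95", "0.56"],
     ["-0.46", "0.36", "-0.20"]]
b = ["0.10", "0.55", "-0.30"]
x = gaussian_elimination(A, b)
print(x)
```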

The elements $a_{11}^{(1)}, a_{22}^{(2)}, \ldots, a_{nn}^{(n)}$ which, at the end of the elimination,
we have renamed $u_{11}, u_{22}, \ldots, u_{nn}$, are called the pivots. They play
a crucial role in the process, appearing as divisors during the elimination
and in the back-substitution. It is clear from (2) and (3) that the
process breaks down if $a_{kk}^{(k)} = 0$, and from (3) we see that $x_n$ can
at best be as accurate as $u_{nn}$. If $a_{kk}^{(k)}$ is very small then its relative
error due to the inexact arithmetic operations is likely to be significant,
so the process must be modified to avoid small pivots if possible.

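Before this, notice that if $L_k$ denotes the elementary lower-triangular
matrix

$$L_k = \begin{pmatrix}
1 & & & & & \\
 & \ddots & & & & \\
 & & 1 & & & \\
 & & m_{k+1,k} & 1 & & \\
 & & \vdots & & \ddots & \\
 & & m_{n,k} & & & 1
\end{pmatrix} \tag{4}$$
then
$$(A^{(k+1)}, b^{(k+1)}) = L_k (A^{(k)}, b^{(k)}), \tag{5}$$
and $L_{n-1} \cdots L_2 L_1 (A^{(1)}, b^{(1)}) = (U, b')$.
Also
$$L_k^{-1} = \begin{pmatrix}
1 & & & & & \\
 & \ddots & & & & \\
 & & 1 & & & \\
 & & -m_{k+1,k} & 1 & & \\
 & & \vdots & & \ddots & \\
 & & -m_{n,k} & & & 1
\end{pmatrix} \tag{6}$$

and $(L_{n-1} \cdots L_2 L_1)^{-1} = (L_1^{-1} L_2^{-1} \cdots L_{n-1}^{-1}) = L$ say, where $L$ is a
lower-triangular matrix whose $k$-th column is the $k$-th column of $L_k^{-1}$,
$k = 1, 2, \ldots, n-1$, and whose $n$-th column is $e_n$ (ER). Hence $A^{(1)} = LU$
and $A$ has been decomposed into the product of a unit lower-triangular
matrix $L$ and an upper-triangular matrix $U$. Such a decomposition
is unique, but can be obtained in other ways. Once we have such
a decomposition, a system of equations can be solved directly by
a forward-substitution and a back-substitution.
With $A = LU$,

$$Ax = b = LUx = Ly \text{ say.} \tag{7}$$
We obtain $y$ from $Ly = b$, and then $x$ from $Ux = y$.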
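
The decomposition and the two substitutions of (7) can be sketched as below. The function names and the small integer matrix are hypothetical; the sketch works in exact arithmetic, performs no interchanges, and stores the negated multipliers directly as the columns of $L$.

```python
from fractions import Fraction

def lu_decompose(A):
    """Return (L, U) with A = LU, L unit lower-triangular; no interchanges."""
    n = len(A)
    U = [[Fraction(v) for v in row] for row in A]
    L = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    for k in range(n - 1):
        for i in range(k + 1, n):
            m = -U[i][k] / U[k][k]      # multiplier m_ik
            L[i][k] = -m                # entry of the k-th column of L_k^{-1}
            U[i] = [ui + m * uk for ui, uk in zip(U[i], U[k])]
    return L, U

def forward_substitute(L, b):
    """Solve Ly = b for lower-triangular L."""
    y = []
    for i, row in enumerate(L):
        y.append((b[i] - sum(row[j] * y[j] for j in range(i))) / row[i])
    return y

def back_substitute(U, y):
    """Solve Ux = y for upper-triangular U."""
    n = len(U)
    x = [Fraction(0)] * n
    for i in range(n - 1, -1, -1):
        x[i] = (y[i] - sum(U[i][j] * x[j] for j in range(i + 1, n))) / U[i][i]
    return x

A = [[2, 1, 1], [4, 3, 3], [8, 7, 9]]
L, U = lu_decompose(A)
x = back_substitute(U, forward_substitute(L, [7, 19, 49]))
print(L, U, x)
```

Once `L` and `U` are available, each further right-hand side costs only the two substitutions.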
To make the elimination process satisfactory in practice, we perform
a row interchange at each stage to bring into the pivotal position
the largest in magnitude of the numbers $a_{ik}^{(k)}$, $i = k, k+1, \ldots, n$. Thus if

$$\max_{i = k, k+1, \ldots, n} |a_{ik}^{(k)}| = |a_{sk}^{(k)}|$$

at the $k$-th stage, we interchange the $k$-th and the $s$-th rows before
performing the eliminations defined by (2) or (5).
This ensures that the magnitude of all multipliers $m_{ik}$ is at most 1.
The interchange can be represented by pre-multiplication of $(A^{(k)}, b^{(k)})$
by a permutation matrix $P_k$, so that the whole process, called Gaussian
elimination with interchanges, can be represented by

$$L_{n-1} P_{n-1} \cdots L_2 P_2 L_1 P_1 (A^{(1)}, b^{(1)}) = (U, b'). \tag{8}$$
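
A sketch of elimination with interchanges follows. The function name is ours and exact rational arithmetic is used, so the sketch illustrates only the mechanics of the pivot choice and the bound on the multipliers, not the rounding behaviour.

```python
from fractions import Fraction

def solve_with_interchanges(A, b):
    """Gaussian elimination with row interchanges, then back-substitution."""
    n = len(A)
    T = [[Fraction(v) for v in row] + [Fraction(r)] for row, r in zip(A, b)]
    multipliers = []
    for k in range(n - 1):
        # Row s holding the entry of largest magnitude in column k.
        s = max(range(k, n), key=lambda i: abs(T[i][k]))
        if T[s][k] == 0:
            raise ZeroDivisionError("matrix is singular")
        T[k], T[s] = T[s], T[k]
        for i in range(k + 1, n):
            m = -T[i][k] / T[k][k]
            multipliers.append(m)
            T[i] = [ti + m * tk for ti, tk in zip(T[i], T[k])]
    x = [Fraction(0)] * n
    for i in range(n - 1, -1, -1):
        acc = sum(T[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (T[i][n] - acc) / T[i][i]
    return x, multipliers

A = [["0.24", "-0.32", "0.18"],
     ["0.94", "-0.95", "0.56"],
     ["-0.46", "0.36", "-0.20"]]
b = ["0.10", "0.55", "-0.30"]
x, multipliers = solve_with_interchanges(A, b)
print(x, multipliers)
```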
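Example
Here $n = 3$ and we use two-digit arithmetic throughout.

$$(A^{(1)}, b^{(1)}) = \left( \begin{array}{rrr|r} .24 & -.32 & .18 & .10 \\ .94 & -.95 & .56 & .55 \\ -.46 & .36 & -.20 & -.30 \end{array} \right). \tag{9}$$

$$P_1 = \begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix}, \quad
L_1 = \begin{pmatrix} 1 & 0 & 0 \\ -.26 & 1 & 0 \\ .49 & 0 & 1 \end{pmatrix},$$

$$(A^{(2)}, b^{(2)}) = \left( \begin{array}{rrr|r} .94 & -.95 & .56 & .55 \\ 0 & -.07 & .03 & -.04 \\ 0 & -.11 & .07 & -.03 \end{array} \right),$$

$$P_2 = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix}, \quad
L_2 = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & -.64 & 1 \end{pmatrix},$$

$$(A^{(3)}, b^{(3)}) = \left( \begin{array}{rrr|r} .94 & -.95 & .56 & .55 \\ 0 & -.11 & .07 & -.03 \\ 0 & 0 & -.015 & -.021 \end{array} \right).$$

The back-substitution gives
$$x_3 = 1.4, \quad x_2 = (.03 + .07 \times 1.4)/.11 \to 1.2,$$
$$x_1 = (.55 + .95 \times 1.2 - .56 \times 1.4)/.94 \to .92.$$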
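
The two-digit calculation can be imitated with Python's `decimal` module, as a sketch under stated assumptions: every individual operation is rounded to two significant digits with the module's default rounding rule, and the function names are ours. The triangular system and the values of $x_3$ and $x_2$ are reproduced under these assumptions; the value obtained for $x_1$ depends on the order of the operations and on the rounding rule, so it is printed but should not be expected to match a hand calculation digit for digit.

```python
from decimal import Context, Decimal

ctx = Context(prec=2)  # each operation is rounded to two significant digits

def eliminate_with_interchanges(T):
    """Reduce the augmented matrix (A, b) to (U, b') in two-digit arithmetic."""
    n = len(T)
    T = [row[:] for row in T]
    for k in range(n - 1):
        s = max(range(k, n), key=lambda i: abs(T[i][k]))
        T[k], T[s] = T[s], T[k]
        for i in range(k + 1, n):
            m = ctx.divide(-T[i][k], T[k][k])
            for j in range(k + 1, n + 1):
                T[i][j] = ctx.add(T[i][j], ctx.multiply(m, T[k][j]))
            T[i][k] = Decimal(0)
    return T

def back_substitute(T):
    """Back-substitution on (U, b'), again rounding after every operation."""
    n = len(T)
    x = [Decimal(0)] * n
    for i in range(n - 1, -1, -1):
        acc = T[i][n]
        for j in range(i + 1, n):
            acc = ctx.subtract(acc, ctx.multiply(T[i][j], x[j]))
        x[i] = ctx.divide(acc, T[i][i])
    return x

system = [[Decimal(v) for v in row] for row in [
    ["0.24", "-0.32", "0.18", "0.10"],
    ["0.94", "-0.95", "0.56", "0.55"],
    ["-0.46", "0.36", "-0.20", "-0.30"],
]]
reduced = eliminate_with_interchanges(system)
x = back_substitute(reduced)
print(reduced[2], x)
```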

The exact solution is $x_1 = x_2 = x_3 = 1.0$, so the computed solution
may seem unsatisfactory. However, the system (9) is very sensitive
to perturbation, and without the interchange given by $P_1$ the first
stage elimination with $.24$ as pivot and $-3.9$ and $1.9$ as multipliers
yields (in two-digit arithmetic)

$$(A^{(2)}, b^{(2)}) = \left( \begin{array}{rrr|r} .24 & -.32 & .18 & .10 \\ 0 & .25 & -.14 & .16 \\ 0 & -.25 & .14 & -.11 \end{array} \right), \tag{10}$$

in which $A^{(2)}$ has rank 2, and the equations are inconsistent. The
sensitivity of the system (9) is caused by the fact that the three
equations which define $x$ are independent, but only just independent:

$$-a_{3\cdot} + 2 a_{1\cdot} = (.94, -1.00, .56), \quad a_{2\cdot} = (.94, -.95, .56),$$
so that $A^{(1)}$ is "nearly singular". If $a_{12}^{(1)}$ is replaced by $-.295$, the resulting
matrix is exactly singular. From a geometrical point of view, each
equation represents a plane in 3-space so that the first and third
equations restrict $x$ to those points on their line of intersection. This
line passes through the plane defined by the second equation, and
so defines a unique point $x$, but this line is very nearly parallel to
the plane, so that a relatively small perturbation could result in a
relatively large perturbation of the point $x$.
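
The near-singularity of the matrix in (9) can be confirmed in exact arithmetic. In this sketch the helper names are ours; it evaluates the determinant of the coefficient matrix, the row combination quoted above, and the determinant after the entry in row 1, column 2 is changed from $-.32$ to $-.295$.

```python
from fractions import Fraction

def det3(M):
    """Determinant of a 3 x 3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = M
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def as_fractions(rows):
    return [[Fraction(v) for v in row] for row in rows]

A = as_fractions([["0.24", "-0.32", "0.18"],
                  ["0.94", "-0.95", "0.56"],
                  ["-0.46", "0.36", "-0.20"]])

# -(row 3) + 2 (row 1), to be compared with row 2.
combo = [-r3 + 2 * r1 for r1, r3 in zip(A[0], A[2])]

perturbed = [row[:] for row in A]
perturbed[0][1] = Fraction("-0.295")

print(det3(A), combo, det3(perturbed))
```

The determinant of the original matrix is small but non-zero, and it vanishes exactly after the perturbation.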

We can also see a reason why large multipliers should be avoided. 
If the $i$-th row $(a_{i\cdot}, b_i)$ is replaced by

$$(a_{i\cdot}, b_i) + m (a_{k\cdot}, b_k),$$

where m is large, we will have replaced the i-th row by another 
row which defines a plane (hyperplane for general n) nearly parallel 
to that defined by the k-th row. The important information contained 
in the i-th row will now be in only the least significant digits of 
the new i-th row, and the most significant digits will be equivalent 
to those of the k-th row. 

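For example, consider $n = 3$, three-digit arithmetic and the three
equations defined by

$$\left( \begin{array}{rrr|r} .00100 & .111 & .111 & .223 \\ .800 & .888 & -.888 & .800 \\ 0 & 0 & .111 & .111 \end{array} \right), \tag{11}$$

which are satisfied exactly by $x_1 = x_2 = x_3 = 1$.

With a row multiplier $m = -800$ we obtain

$$\left( \begin{array}{rrr|r} .00100 & .111 & .111 & .223 \\ 0 & -87.9 & -89.7 & -178 \\ 0 & 0 & .111 & .111 \end{array} \right), \tag{12}$$

and multiplying the second equation by $-.111/87.9$, so that we can
compare it with the first equation, we obtain

$$\left( \begin{array}{rrr|r} 0 & .111 & .113 & .225 \end{array} \right). \tag{13}$$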

The plane defined by (13) is still distinct from that defined by 

the first row of (11), but is now very nearly parallel, so the line 
defined by their intersection is much more sensitive to small perturba- 
tions than the line defined by the intersection of the two planes defined 
by the first two rows of (11). 
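
The loss of independence can be quantified by the angle between the normals of the two planes. In this sketch, which uses ordinary floating-point arithmetic rather than three-digit arithmetic, the new second row is recomputed from the first two rows of (11) with the multiplier $-800$; the helper `cosine` is ours.

```python
import math

def cosine(u, v):
    """Cosine of the angle between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

row1 = (0.001, 0.111, 0.111)      # coefficients of the first equation of (11)
row2 = (0.8, 0.888, -0.888)       # coefficients of the second equation of (11)
new_row2 = tuple(b - 800 * a for a, b in zip(row1, row2))

before = cosine(row1, row2)       # close to 0: planes nearly perpendicular
after = cosine(row1, new_row2)    # close to -1: planes nearly parallel
print(before, after)
```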
The sensitivity of the system (9) is demonstrated by the small pivot
$a_{33}^{(3)}$, because a small perturbation in $a_{33}^{(1)}$ could result in a much larger
proportional change in $a_{33}^{(3)}$.

The ‘cost’ of the Gaussian elimination algorithm, in other words
the amount of work or computer time the algorithm requires, clearly
increases with $n$. To see the manner in which the cost increases
with $n$ we evaluate the number of arithmetic operations involved,
for example multiplications. At the $k$-th stage, from (2), we need
one division (to obtain $1/a_{kk}^{(k)}$). Then the new $i$-th row, for $i = k+1$,
$k+2, \ldots, n$, requires one multiplication for the multiplier $m_{ik}$ and $(n-k)$
multiplications $m_{ik} a_{kj}^{(k)}$, for $j = k+1, k+2, \ldots, n$. The total number
of multiplications at the $k$-th stage is therefore $(n-k+1)(n-k)$ and
the overall total is

$$\sum_{k=1}^{n-1} (n-k+1)(n-k) = \sum_{k=1}^{n-1} (k^2 + k)
= (n-1)n(2n-1)/6 + (n-1)n/2 = n^3/3 - n/3.$$

This is a cubic polynomial in $n$. The dominant term is $n^3/3$, compared
with which quadratic and linear terms are unimportant, so we can
say the elimination requires essentially $n^3/3$ multiplications. The
operations which convert $b^{(1)}$ to $b^{(n)}$ require essentially $n^2/2$ multiplications,
and so does the back-substitution (ER). The corresponding numbers
of additions/subtractions are also essentially $n^3/3$, $n^2/2$, $n^2/2$ (ER).
Overall the number of arithmetic operations required to perform the
Gaussian elimination algorithm is a cubic polynomial in $n$ (essentially
$n^3/3$) so we say it is a polynomial-time algorithm. Notice that for
several right-hand-side vectors b, only the operations on b and in 
the back-substitution have to be duplicated, not the elimination 
operations. 
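
The multiplication count can be checked by direct enumeration. The sketch below follows the accounting used above (one multiplication for each multiplier and $n - k$ products for each new row, the single division per stage not being counted); the function name is ours.

```python
def elimination_multiplications(n):
    """Count multiplications in the n-1 elimination stages on the matrix A."""
    count = 0
    for k in range(1, n):                  # stages k = 1, ..., n-1
        for i in range(k + 1, n + 1):      # rows below the pivotal row
            count += 1                     # forming the multiplier m_ik
            count += n - k                 # products m_ik * a_kj, j = k+1, ..., n
    return count

for n in (3, 10, 100):
    print(n, elimination_multiplications(n), (n**3 - n) // 3)
```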

In the two-part simplex method, the arithmetic operations of the
first stage effectively reduce an $m \times m$ submatrix of $A$ to the unit
matrix. This is the Gauss-Jordan elimination and is not generally
recommended for solving $Ax = b$ in practice, even with interchanges,
because it requires about 50% more arithmetic operations than reduction
to triangular form. At the $k$-th stage of the Gauss-Jordan elimination
multiples of the $k$-th row are added to rows $1, 2, \ldots, k-1$ as well
as to rows $k+1, k+2, \ldots, n$. This is equivalent to pre-multiplication
by the matrix $E$ (see (14)),

$$E = \begin{pmatrix}
1 & & m_{1k} & & \\
 & \ddots & \vdots & & \\
 & & 1 & & \\
 & & \vdots & \ddots & \\
 & & m_{nk} & & 1
\end{pmatrix}, \tag{14}$$

where $m_{ik} = -a_{ik}^{(k)} / a_{kk}^{(k)}$, $i = 1, 2, \ldots, k-1, k+1, k+2, \ldots, n$. The overall
result is a diagonal matrix $A^{(n+1)}$.

If in addition we replace $(E)_{kk}$ by $1/a_{kk}^{(k)}$ we have exactly the elementary
matrix of section 3.7, and $A^{(n+1)}$ will be a unit matrix.
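
The figure of about 50% can be checked by counting. This sketch assumes $n$ Gauss-Jordan stages, with each of the other $n - 1$ rows costing one multiplication for its multiplier and $n - k$ products at stage $k$, and compares the total with the same count for reduction to triangular form; both function names are ours.

```python
def triangular_multiplications(n):
    """Multiplications in reduction to triangular form (stages 1, ..., n-1)."""
    return sum((n - k) * (1 + n - k) for k in range(1, n))

def gauss_jordan_multiplications(n):
    """Multiplications in Gauss-Jordan reduction to diagonal form (stages 1, ..., n)."""
    count = 0
    for k in range(1, n + 1):
        for i in range(1, n + 1):
            if i != k:                 # every row except the pivotal row
                count += 1 + (n - k)
    return count

for n in (3, 10, 100):
    print(n, triangular_multiplications(n), gauss_jordan_multiplications(n))
```

Under this accounting the Gauss-Jordan total is $(n^3 - n)/2$, exactly one and a half times the triangular total.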

At each stage of the simplex method as described in chapter 3,
once the pivotal column has been chosen (i.e. once a negative e.c.c.
$c_j$ has been chosen) the pivotal element is prescribed and cannot
be chosen to minimise the effects of arithmetic inaccuracies. This
could lead to a seriously inaccurate solution as the example (9) shows.
The example uses two-digit decimal arithmetic whereas most computers
and calculators use binary arithmetic equivalent to somewhere between
six-digit and fourteen-digit decimal arithmetic. The higher accuracy
does mean that in some examples interchanges could be dispensed
with and multipliers larger than 1 used, but it cannot in general be
relied upon to avoid the problem of unsuitable pivots in the simplex
method.

However, as we saw in section 7.3 we can regard each simplex
stage as solving three $m \times m$ systems of linear equations involving
one matrix of coefficients.

Writing the three systems (2) of section 7.3 as

$$A x_1 = b_1, \quad A x_2 = b_2, \quad A^T x_3 = b_3, \tag{14}$$
if we obtain $L$ and $U$ such that $A = LU$ but with $L^{-1}$ and $U$ actually
present as in (8) with $A = A^{(1)}$, then $x_1$ and $x_2$ can both be obtained
by a forward- and a back-substitution as in (7).

If $A = LU$, then $A^T = U^T L^T$, and $A^T (L^T)^{-1} = U^T$.

As $(L^T)^{-1} = (L^{-1})^T$, and $P_k^T = P_k$,

$$(A^{(1)})^T P_1 L_1^T P_2 L_2^T \cdots P_{n-1} L_{n-1}^T = U^T,$$
and so the arithmetic operations for obtaining $L$ and $U$ only have
to be performed once.
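
One way to read this last step, given here as a sketch for the simpler case without interchanges (so that $A = LU$ exactly), is the following; the auxiliary vector $w$ is a name introduced only for this illustration.

$$A^T x_3 = b_3 \iff U^T (L^T x_3) = b_3 .$$

Since $U^T$ is lower-triangular, $w$ is obtained from $U^T w = b_3$ by a forward-substitution, and since $L^T$ is unit upper-triangular, $x_3$ is then obtained from $L^T x_3 = w$ by a back-substitution. The same factors $L$ and $U$ therefore serve all three systems.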

This approach to the simplex method, incorporating an interchange 
strategy, is safe and satisfactory from a numerical point of view. 
It also involves very little extra work compared to the tableau approach 
because at each stage the matrix A is that of the previous stage, 
with one column changed, and it is possible to make use of the L 
and U we already have. These can be updated rather than completely 
re-computed as described in {12} and in section 1, chapter 2 of {8}. 

For an extensive discussion of the material in this section, see 


any of {4}, {5}, {6}, {7}. 


LIST OF THEOREMS 


Theorem 1, on feasible solutions of l.p.p.s, is in section 2.5.

Theorem 2, on optimality at extreme points, is in section 2.5 and
Appendix 1.

Theorem 3, on extreme points and b.f.s.s, is in section 2.8.

Theorem 4, the fundamental theorem of linear programming, is in
section 2.8.

Theorem 5, on finite termination of the simplex method, is in section
3.5.

Theorem 6, on canonical and standard form, is in section 5.2.

Theorem 8, the duality theorem, is in section 5.4, 6.1, and Appendix

Theorem 9, the equilibrium theorem, is in section 5.6.

Theorem 10, on the separating hyperplane, is in section 6.3.

Theorem 11, on the validity of the ellipsoid algorithm, is in section